Show simple item record

Field | Value | Language
dc.contributor.author | Bawa, Payal |
dc.date.accessioned | 2023-09-05T03:05:15Z |
dc.date.available | 2023-09-05T03:05:15Z |
dc.date.issued | 2023 | en
dc.identifier.uri | https://hdl.handle.net/2123/31647 |
dc.description.abstract | Significant progress in Deep Learning has helped Deep Reinforcement Learning (RL) algorithms achieve human-like performance across applications ranging from games like Go and chess to simple robotic tasks. Off-policy Deep RL algorithms in particular have shown promising results on a wide range of simulated tasks. However, they are encumbered by stability concerns, which prevent their real-world deployment. This thesis makes several contributions toward developing off-policy deep RL algorithms that are robust, scalable, sample-efficient and suitable for safety-critical applications. Our first contribution is Bagged Critic for Continuous Control (BC3). BC3 mitigates overestimation bias in off-policy actor-critic algorithms by employing an ensemble of state-value functions. Our second contribution is Spectral Normalized Actor Critic (SNAC). SNAC bounds the Lipschitz constants of the actor-critic networks in off-policy algorithms, which in turn bounds the gradients flowing through the network. Bounded gradients help RL algorithms learn more robust and sample-efficient policies. Our last contribution is orthogonality-constrained actor-critic algorithms. Enforcing orthogonality on the weight matrices of actor-critic networks helps preserve the norm of the gradients, thus preventing vanishing gradients and avoiding convergence to suboptimal policies. | en
dc.language.iso | en | en
dc.subject | Deep Reinforcement Learning | en
dc.subject | Deep Learning | en
dc.subject | Reinforcement learning | en
dc.title | Robust Off-Policy Deep Reinforcement Learning | en
dc.type | Thesis |
dc.type.thesis | Doctor of Philosophy | en
dc.rights.other | The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission. | en
usyd.faculty | SeS faculties schools::Faculty of Engineering::School of Civil Engineering | en
usyd.degree | Doctor of Philosophy Ph.D. | en
usyd.awardinginst | The University of Sydney | en
usyd.advisor | Ramos, Fabio |



There are no previous versions of the item available.