Sequence learning using deep neural networks with flexibility and interpretability

Li, Chang

Access status: Open Access

Field	Value	Language
dc.contributor.author	Li, Chang
dc.date.accessioned	2021-06-08T23:30:48Z
dc.date.available	2021-06-08T23:30:48Z
dc.date.issued	2021	en_AU
dc.identifier.uri	https://hdl.handle.net/2123/25390
dc.description	includes published articles
dc.description.abstract	Throughout this thesis, I investigate two long-standing yet rarely explored sequence learning challenges under the Probabilistic Graphical Models (PGMs) framework: learning multi-timescale representations on a single sequence and learning higher-order dynamics between multiple sequences. The first challenge is tackled with Hidden Markov Models (HMMs), a type of directed PGM, under the reinforcement learning framework. I prove that the Semi-Markov Decision Problem (SMDP)-formulated option framework [Sutton et al., 1999; Bacon et al., 2017; Zhang and Whiteson, 2019], one of the most promising Hierarchical Reinforcement Learning (HRL) frameworks, has a Markov Decision Problem (MDP) equivalent. Based on this equivalence, a simple yet effective Skill-Action (SA) architecture is proposed. Our empirical studies on challenging robot simulation environments demonstrate that SA significantly outperforms all baselines on both infinite-horizon and transfer learning environments. Because of its exceptional scalability, SA gives rise to a large-scale pre-training architecture in reinforcement learning. The second challenge is tackled with Markov Random Fields (MRFs), also known as undirected PGMs, under the supervised learning framework. I employ binary MRFs with weighted Lower Linear Envelope Potentials (LLEPs) to capture higher-order dependencies. I propose an exact inference algorithm under the graph-cuts framework and an efficient learning algorithm under the Latent Structural Support Vector Machines (LSSVMs) framework. To learn higher-order latent dynamics on time series, we layer multi-task recurrent neural networks (RNNs) on top of MRFs. A sub-gradient algorithm is employed to perform end-to-end training. We conduct thorough empirical studies on three popular Chinese stock market indexes, and the proposed method outperforms all baselines. To the best of our knowledge, the proposed technique is the first to investigate higher-order dynamics between stocks.	en_AU
dc.language.iso	en	en_AU
dc.subject	reinforcement learning	en_AU
dc.subject	temporal abstraction	en_AU
dc.subject	artificial intelligence	en_AU
dc.title	Sequence learning using deep neural networks with flexibility and interpretability	en_AU
dc.type	Thesis
dc.type.thesis	Doctor of Philosophy	en_AU
dc.rights.other	The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.	en_AU
usyd.faculty	SeS faculties schools::Faculty of Engineering::School of Computer Science	en_AU
usyd.degree	Doctor of Philosophy Ph.D.	en_AU
usyd.awardinginst	The University of Sydney	en_AU
usyd.advisor	TAO, DACHENG