Show simple item record

Field | Value | Language
dc.contributor.author | Jie, Renlong |
dc.date.accessioned | 2021-01-21 |
dc.date.available | 2021-01-21 |
dc.date.issued | 2020 | en
dc.identifier.uri | https://hdl.handle.net/2123/24342 |
dc.description.abstract | Online data streams have become one of the most common forms of data in the modern world, and online approaches are also used to improve the efficiency of model training. Both trends create a strong demand for hyper-parameter optimization and architecture adaptation techniques for online learning. This thesis contains four projects on this topic. The first project concerns online parallel hyper-parameter optimization and model training on data streams. A framework called HyperTube is proposed for online hyper-parameter optimization under limited computing resources. This study also introduces a “micro-mini-batch training mechanism” to reuse online data mini-batches in a relatively efficient way. The second study concerns online adaptation of activation functions, in which I propose a general combined form of flexible activation functions as well as three principles for choosing flexible activation components. Based on this, two novel flexible activation functions with bounded or unbounded outputs are developed. Two new regularisation terms, based on assumptions that serve as prior knowledge, are also proposed. The third study concerns online learning rate adaptation, in which I investigate different levels of learning rate adaptation within the framework of hyper-gradient descent. Based on this, I propose an optimization method that adaptively learns the combination weights for different levels of adaptive learning rates. In the fourth study, I introduce a growing mechanism for differentiable neural architecture search based on network morphism. It enables cell structures to grow from small to large sizes with one-shot training. Two modes can be applied to integrate the growing process with the original pruning process. A novel two-input backbone architecture is also proposed for recurrent neural networks. The proposed methods are well supported by experiments and could contribute to future studies on improving the efficiency of deep learning methods. | en
dc.language.iso | en | en
dc.publisher | University of Sydney | en
dc.subject | Deep Learning | en
dc.subject | Neural Networks | en
dc.subject | Online Data Streams | en
dc.subject | Hyper-parameter | en
dc.subject | Optimization | en
dc.subject | Neural Architecture Search | en
dc.title | Online Architecture Optimization for Deep Neural Networks | en
dc.type | Thesis |
dc.type.thesis | Doctor of Philosophy | en
dc.rights.other | The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission. | en
usyd.faculty | SeS faculties schools::The University of Sydney Business School::Discipline of Business Analytics | en
usyd.degree | Doctor of Philosophy Ph.D. | en
usyd.awardinginst | The University of Sydney | en
usyd.advisor | GAO, JUNBIN |
usyd.advisor | VASNEV, ANDREY |


There are no previous versions of the item available.