Show simple item record

Field | Value | Language
dc.contributor.author | Zhou, Junyu |
dc.date.accessioned | 2025-08-15T06:45:13Z |
dc.date.available | 2025-08-15T06:45:13Z |
dc.date.issued | 2025 | en
dc.identifier.uri | https://hdl.handle.net/2123/34229 |
dc.description.abstract | Deep neural networks (DNNs) have become central to modern machine learning due to their strong empirical performance. However, their theoretical understanding, especially regarding generalization, remains limited. This thesis advances the theory of deep ReLU networks through two lenses: pairwise learning tasks and gradient descent methods. For pairwise learning, we study generalization in non-parametric estimation without relying on restrictive convexity or VC-class assumptions. We establish sharp oracle inequalities for empirical minimizers under general hypothesis spaces and Lipschitz pairwise losses. Applied to pairwise least squares regression, our bounds match known minimax rates up to logarithmic terms. A key innovation is constructing a structured deep ReLU network approximating the true predictor, forming a target hypothesis space with controlled complexity. This framework successfully handles problems beyond the reach of existing theories. For metric and similarity learning, we exploit the structure of the true metric. By deriving its form under hinge loss, we approximate it using structured deep ReLU networks and analyze the excess generalization error by bounding the approximation and the estimation errors. An optimal excess risk rate is achieved, marking the first known such analysis for metric/similarity learning. We also explore extensions to general losses. For gradient descent methods, we study GD and SGD for overparameterized deep ReLU networks in the NTK regime. Prior work mainly covers shallow networks; we fill this gap by establishing the first minimax-optimal generalization rates for GD/SGD with deep architectures. Under polynomial width scaling, our results show these methods can match the generalization performance of kernel approaches. | en
dc.language.iso | en | en
dc.rights | The author retains copyright of this thesis |
dc.subject | Deep Learning Theory | en
dc.subject | Deep Neural Networks | en
dc.subject | Generalization Analysis | en
dc.subject | Metric and Similarity Learning | en
dc.subject | Stochastic Gradient Descent | en
dc.title | Learning Ability of Deep ReLU Networks: Pairwise Tasks and Gradient Descent Methods | en
dc.type | Thesis |
dc.type.thesis | Doctor of Philosophy | en
dc.rights.other | The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission. | en
usyd.faculty | SeS faculties schools::Faculty of Science::School of Mathematics and Statistics | en
usyd.department | Mathematics and Statistics | en
usyd.degree | Doctor of Philosophy Ph.D. | en
usyd.awardinginst | The University of Sydney | en
usyd.advisor | Zhou, Dingxuan |


There are no previous versions of the item available.