| Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Chen, Xuanyu | |
| dc.date.accessioned | 2026-03-30T03:12:01Z | |
| dc.date.available | 2026-03-30T03:12:01Z | |
| dc.date.issued | 2026 | en |
| dc.identifier.uri | https://hdl.handle.net/2123/35050 | |
| dc.description.abstract | This thesis investigates distributed self-supervised learning as a paradigm for training visual representation models directly from distributed and unlabelled data. While large-scale supervised datasets have driven advances in visual artificial intelligence, their centralised collection and annotation are costly, and real-world data is inherently fragmented across edge devices, institutions, and sensors. Distributed self-supervised learning aims to leverage such data without labels or central coordination, raising key challenges including robustness under heterogeneous client distributions, feasibility on resource-limited devices, and whether distributed training can approach centralised performance. This thesis makes four main contributions. First, it shows how decentralisation reshapes scaling laws, proving that the compute-optimal model size decreases as data becomes more distributed, explaining the effectiveness of lightweight models on edge devices. Second, it demonstrates that distributed training exhibits a generalisation gap compared to centralised training under equal compute, which can be reduced by increasing data through more clients or larger local datasets. Third, it provides a systematic theoretical analysis under heterogeneous data, showing that Masked Image Modelling is more robust than contrastive learning, with robustness improving with network connectivity, and introduces MAR loss to enhance robustness. Finally, it proposes DeNAV, a decentralised framework that removes server dependence and incorporates informed client selection, staleness-aware aggregation, and lightweight masked autoencoder pre-training for efficient communication, with theoretical guarantees and strong empirical performance. Overall, this thesis establishes distributed self-supervised learning as a feasible and effective paradigm and shows how theoretical insights on scaling laws, generalisation, and robustness guide the design of practical distributed learning systems. | en |
| dc.language.iso | en | en |
| dc.subject | Distributed Self-Supervised Learning | en |
| dc.subject | Federated Learning | en |
| dc.subject | Decentralized Learning | en |
| dc.subject | Heterogeneous Data | en |
| dc.subject | Scaling Laws | en |
| dc.subject | PAC-Bayesian Generalisation Analysis | en |
| dc.title | Self-Supervised Visual Representation Learning in Distributed Systems - Generalisation Analysis and Training Framework Design | en |
| dc.type | Thesis | |
| dc.type.thesis | Doctor of Philosophy | en |
| dc.rights.other | The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission. | en |
| usyd.faculty | SeS faculties schools::Faculty of Engineering::School of Electrical and Information Engineering | en |
| usyd.degree | Doctor of Philosophy Ph.D. | en |
| usyd.awardinginst | The University of Sydney | en |
| usyd.advisor | Yuan, Dong | |
| usyd.include.pub | No | en |

