Learning Deep Representations from Unlabelled Data for Visual Recognition
Access status:
USyd Access
Type
ThesisThesis type
Doctor of PhilosophyAuthor/s
Feng, ZeyuAbstract
Self-supervised learning (SSL) aims at extracting from abundant unlabelled images transferable semantic features, which benefit various downstream visual tasks by reducing the sample complexity when human annotated labels are scarce. SSL is promising because it also boosts performance ...
See moreSelf-supervised learning (SSL) aims at extracting from abundant unlabelled images transferable semantic features, which benefit various downstream visual tasks by reducing the sample complexity when human annotated labels are scarce. SSL is promising because it also boosts performance in diverse tasks when combined with the knowledge of existing techniques. Therefore, it is important and meaningful to study how SSL leads to better transferability and design novel SSL methods. To this end, this thesis proposes several methods to improve SSL and its function in downstream tasks. We begin by investigating the effect of unlabelled training data, and introduce an information-theoretical constraint for SSL from multiple related domains. In contrast to conventional single dataset, exploiting multi-domains has the benefits of decreasing the build-in bias of individual domain and allowing knowledge transfer across domains. Thus, the learned representation is more unbiased and transferable. Next, we describe a feature decoupling (FD) framework that incorporates invariance into predicting transformations, one main category of SSL methods, by observing that they often lead to co-variant features unfavourable for transfer. Our model learns a split representation that contains both transformation related and unrelated parts. FD achieves SOTA results on SSL benchmarks. We also present a multi-task method with theoretical understanding for contrastive learning, the other main category of SSL, by leveraging the semantic information from synthetic images to facilitate the learning of class-related semantics. Finally, we explore self-supervision in open-set unsupervised classification with the knowledge of source domain. We propose to enforce consistency under transformation of target data and discover pseudo-labels from confident predictions. Experimental results outperform SOTA open-set domain adaptation methods.
See less
See moreSelf-supervised learning (SSL) aims at extracting from abundant unlabelled images transferable semantic features, which benefit various downstream visual tasks by reducing the sample complexity when human annotated labels are scarce. SSL is promising because it also boosts performance in diverse tasks when combined with the knowledge of existing techniques. Therefore, it is important and meaningful to study how SSL leads to better transferability and design novel SSL methods. To this end, this thesis proposes several methods to improve SSL and its function in downstream tasks. We begin by investigating the effect of unlabelled training data, and introduce an information-theoretical constraint for SSL from multiple related domains. In contrast to conventional single dataset, exploiting multi-domains has the benefits of decreasing the build-in bias of individual domain and allowing knowledge transfer across domains. Thus, the learned representation is more unbiased and transferable. Next, we describe a feature decoupling (FD) framework that incorporates invariance into predicting transformations, one main category of SSL methods, by observing that they often lead to co-variant features unfavourable for transfer. Our model learns a split representation that contains both transformation related and unrelated parts. FD achieves SOTA results on SSL benchmarks. We also present a multi-task method with theoretical understanding for contrastive learning, the other main category of SSL, by leveraging the semantic information from synthetic images to facilitate the learning of class-related semantics. Finally, we explore self-supervision in open-set unsupervised classification with the knowledge of source domain. We propose to enforce consistency under transformation of target data and discover pseudo-labels from confident predictions. Experimental results outperform SOTA open-set domain adaptation methods.
See less
Date
2021Rights statement
The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.Faculty/School
Faculty of Engineering, School of Computer ScienceAwarding institution
The University of SydneyShare