Advancing Contrastive Learning for Visual Representation: From Unsupervised to Supervised Paradigms
Access status:
Open Access
Type
Thesis
Thesis type
Doctor of Philosophy
Author/s
Zheng, Mingkai
Abstract
Contrastive Visual Representation Learning has emerged as a cornerstone in computer vision research, celebrated for its simplicity and exceptional performance. At its core, this approach leverages instance discrimination, contrasting positive and negative samples from the same or different instances, to learn effective representations. These representations have demonstrated remarkable efficacy across diverse visual tasks. This thesis delves deeper into the potential of contrastive-based visual representation learning, addressing key research questions: How can contrastive learning be optimized for various learning paradigms, including unsupervised/self-supervised, semi-supervised, and fully supervised settings? What novel algorithms and techniques can mitigate the typical challenges of contrastive learning? To answer these questions, we design and propose innovative methods to overcome challenges such as mitigating class collision problems in self-supervised settings, optimizing label efficiency in semi-supervised scenarios, and addressing intra-class disparities in fully supervised contexts. Comprehensive experiments and analyses are conducted on diverse visual datasets to validate the effectiveness and versatility of our approaches. Furthermore, this thesis explores the practical applications of fine-tuning contrastive pre-trained models, highlighting their potential in low-shot image classification, object detection, semantic segmentation, and other real-world tasks requiring robust visual representations. By advancing the theoretical and practical foundations of contrastive learning, this thesis contributes significantly to making visual representation learning more robust, versatile, and applicable to a broader spectrum of challenges in computer vision.
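The instance-discrimination objective described above, contrasting positive pairs (two views of the same instance) against negatives (other instances in the batch), is commonly implemented as the InfoNCE (NT-Xent) loss. The NumPy sketch below is illustrative only: it is not code from the thesis, and the function name and batch layout are assumptions chosen for clarity.

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.5):
    """Illustrative InfoNCE loss over a batch of paired embeddings.

    z1, z2: (N, D) arrays of L2-normalised embeddings of two augmented
    views of the same N instances. Row i of z1 and row i of z2 form a
    positive pair; every other row in the concatenated batch is a negative.
    """
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)        # (2N, D) combined batch
    sim = z @ z.T / temperature                 # cosine similarities (unit-norm inputs)
    np.fill_diagonal(sim, -np.inf)              # exclude each sample's self-similarity
    # the positive for row i is row i+N (and vice versa)
    pos_idx = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    # row-wise log-softmax, then pick out the positive's log-probability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos_idx].mean()

# toy usage: two identical (perfectly aligned) views of 4 random instances
rng = np.random.default_rng(0)
v = rng.normal(size=(4, 8))
v /= np.linalg.norm(v, axis=1, keepdims=True)
loss = info_nce_loss(v, v)
```

Minimising this quantity pulls matching views together and pushes other instances apart, which is the mechanism the class-collision problem mentioned in the abstract arises from: semantically similar instances are still treated as negatives.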
Date
2025
Rights statement
The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.
Faculty/School
Faculty of Engineering, School of Computer Science
Awarding institution
The University of Sydney