Advancing Contrastive Learning for Visual Representation: From Unsupervised to Supervised Paradigms
Access status:
Open Access
Type
Thesis
Thesis type
Doctor of Philosophy
Author/s
Zheng, Mingkai
Abstract
Contrastive Visual Representation Learning has emerged as a cornerstone in computer vision research, celebrated for its simplicity and exceptional performance. At its core, this approach leverages instance discrimination, contrasting positive and negative samples from the same or different instances, to learn effective representations. These representations have demonstrated remarkable efficacy across diverse visual tasks. This thesis delves deeper into the potential of contrastive-based visual representation learning, addressing key research questions: How can contrastive learning be optimized for various learning paradigms, including unsupervised/self-supervised, semi-supervised, and fully supervised settings? What novel algorithms and techniques can mitigate the typical challenges of contrastive learning? To answer these questions, we design and propose innovative methods to overcome challenges such as mitigating class collision problems in self-supervised settings, optimizing label efficiency in semi-supervised scenarios, and addressing intra-class disparities in fully supervised contexts. Comprehensive experiments and analyses are conducted on diverse visual datasets to validate the effectiveness and versatility of our approaches. Furthermore, this thesis explores the practical applications of fine-tuning contrastive pre-trained models, highlighting their potential in low-shot image classification, object detection, semantic segmentation, and other real-world tasks requiring robust visual representations. By advancing the theoretical and practical foundations of contrastive learning, this thesis contributes significantly to making visual representation learning more robust, versatile, and applicable to a broader spectrum of challenges in computer vision.
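The instance-discrimination objective described above, contrasting positive pairs (two views of the same instance) against negatives (other instances in the batch), is commonly implemented as the InfoNCE (NT-Xent) loss. The NumPy sketch below is illustrative only: it is not code from the thesis, and the function name and batch layout are assumptions chosen for clarity.

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.5):
    """Illustrative InfoNCE loss over a batch of paired embeddings.

    z1, z2: (N, D) arrays of L2-normalised embeddings of two augmented
    views of the same N instances. Row i of z1 and row i of z2 form a
    positive pair; every other row in the concatenated batch is a negative.
    """
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)        # (2N, D) combined batch
    sim = z @ z.T / temperature                 # cosine similarities (unit-norm inputs)
    np.fill_diagonal(sim, -np.inf)              # exclude each sample's self-similarity
    # the positive for row i is row i+N (and vice versa)
    pos_idx = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    # row-wise log-softmax, then pick out the positive's log-probability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos_idx].mean()

# toy usage: two identical (perfectly aligned) views of 4 random instances
rng = np.random.default_rng(0)
v = rng.normal(size=(4, 8))
v /= np.linalg.norm(v, axis=1, keepdims=True)
loss = info_nce_loss(v, v)
```

Minimising this quantity pulls matching views together and pushes other instances apart, which is the mechanism the class-collision problem mentioned in the abstract arises from: semantically similar instances are still treated as negatives.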
Date
2025
Rights statement
The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.
Faculty/School
Faculty of Engineering, School of Computer Science
Awarding institution
The University of Sydney