Label-Efficient Deep Learning with the Pre-trained Models
Access status: USyd Access
Type: Thesis
Thesis type: Doctor of Philosophy
Author/s: Wen, Ziting
Abstract:
The test error of deep learning models often decreases as a power law as training data and model size grow, so building a powerful model requires collecting a significant amount of training data. Data collection, especially of labeled data, is expensive and labor-intensive. To address this issue, we propose to integrate active learning and semi-supervised learning on top of pre-training. Combining pre-trained models with active learning introduces challenges such as intensified phase transitions and increased training costs. The intensified phase transition means that the types of samples that should be selected change rapidly as the annotation budget changes, so existing methods are effective only within a limited range of labeling budgets. To tackle this, we introduce a novel active learning strategy, Neural Tangent Kernel Clustering-Pseudo-Labels (NTKCPL), which allows active learning to directly estimate the model's empirical risk on the active learning pool. Additionally, based on an analysis of the estimation error, we propose a pseudo-label generation method that reduces this error. Experimental results demonstrate that our approach outperforms existing methods and remains effective over a wider range of labeling budgets. Furthermore, since pre-trained models are often large and require significant time to train, combining them with active learning substantially increases computational time. We therefore propose a new efficient active learning framework that improves active learning accuracy by aligning the training method with the features used for active learning. Finally, to further reduce the need for manual annotation, we introduce Active Self-Semi-Supervised Learning (AS3L). By jointly exploiting pre-trained weight initialization, active sample selection, and semi-supervised learning guided by prior pseudo-labels, we greatly improve model performance when labeled data is limited.
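To make the abstract's idea of clustering-based pseudo-labels for active sample selection concrete, the following is a minimal, hypothetical sketch only: it clusters frozen pre-trained features, propagates labels from the few annotated points to each cluster by majority vote, and queries samples whose cluster has no labeled member or that sit far from their cluster centre. The function name `select_queries`, the scoring rule, and the use of k-means are illustrative assumptions and are not the thesis's NTKCPL algorithm or its empirical-risk estimator.

```python
# Hypothetical sketch (not the thesis's NTKCPL implementation): cluster
# pre-trained features, assign each cluster a pseudo-label by majority vote
# over its labeled members, and query samples that look most likely to carry
# pseudo-label errors, as a rough proxy for reducing estimated empirical risk.
import numpy as np
from sklearn.cluster import KMeans

def select_queries(features, labeled_idx, labels, n_clusters=50, budget=10):
    """Return indices of unlabeled samples to annotate next.

    features    : (N, D) array of frozen pre-trained feature embeddings
    labeled_idx : list of indices of already-labeled samples
    labels      : class labels aligned with labeled_idx
    """
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(features)
    assign = km.labels_

    # Majority vote of labeled samples inside each cluster -> cluster pseudo-label.
    cluster_label = {}
    for c in range(n_clusters):
        members = [i for i in labeled_idx if assign[i] == c]
        if members:
            vals, counts = np.unique(
                [labels[labeled_idx.index(i)] for i in members], return_counts=True)
            cluster_label[c] = vals[np.argmax(counts)]

    # Score unlabeled samples: prefer clusters with no labeled member (unknown
    # pseudo-label), then samples far from their cluster centre.
    dists = np.linalg.norm(features - km.cluster_centers_[assign], axis=1)
    unlabeled = [i for i in range(len(features)) if i not in set(labeled_idx)]
    scores = [(assign[i] not in cluster_label, dists[i]) for i in unlabeled]
    order = sorted(range(len(unlabeled)), key=lambda k: scores[k], reverse=True)
    return [unlabeled[k] for k in order[:budget]]

# Toy usage: 200 random feature vectors, 5 of them already labeled.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 32)).astype(np.float32)
picked = select_queries(X, labeled_idx=[0, 1, 2, 3, 4], labels=[0, 0, 1, 1, 2],
                        n_clusters=8, budget=5)
print("query these indices next:", picked)
```

In practice the features would come from the pre-trained backbone discussed in the abstract rather than random data, and the selection criterion in the thesis is driven by an NTK-based estimate of empirical risk rather than this simple distance heuristic.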
Date: 2025
Rights statement: The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.
Faculty/School: Faculty of Engineering, School of Aerospace Mechanical and Mechatronic Engineering
Awarding institution: The University of Sydney