Advances in Imperfect Supervision: From Multiple Unlabeled Sets to Weakly-Annotated Graphs

Wu, Yuhao

Access status

Open Access

Type

Thesis

Thesis type

Doctor of Philosophy

Author/s

Wu, Yuhao

Abstract

Supervised machine learning has been a major driver of progress in artificial intelligence, powering applications in domains such as healthcare and robotics. Supervised learning methods typically rely on large datasets with accurate labels. In real-world settings, however, such perfectly labeled data is often unattainable because of imperfections in data collection, including limited availability, missing values, and annotation errors. These challenges have motivated the development of reliable and robust approaches that can handle imperfect supervision, which commonly arises in three forms: inexact, incomplete, and inaccurate supervision. This thesis investigates advanced topics spanning these three forms. For inexact supervision, we introduce a novel problem setting for binary classification from multiple unlabeled datasets, which relies on minimal and easily obtainable supervision signals. For incomplete supervision, we focus on graph-based positive-unlabeled (PU) learning and reveal how the structural characteristics of graphs can violate key assumptions of conventional PU approaches. For inaccurate supervision, we tackle label noise in graph data by proposing a topological sample selection approach that leverages graph structure to identify clean and informative nodes more effectively. Together, these contributions advance the understanding and capability of machine learning under imperfect supervision, particularly in structurally complex environments such as graphs. By systematically addressing challenges across inexact, incomplete, and inaccurate supervision, the proposed methodologies bridge theoretical principles with practical implementation and pave the way for more robust, adaptable, and trustworthy learning systems even when training data is coarse, partial, or noisy.

Date

2025

Rights statement

The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.

Faculty/School

Faculty of Engineering, School of Computer Science

Awarding institution

The University of Sydney

Subjects

Imperfect Supervision
Positive-Unlabeled Learning
Weakly Supervised Learning
Graph Neural Networks
Class-Prior Estimation
Label Noise Robustness