Robust and Efficient Training of Deep Neural Networks via Principled Stopping Strategies

Yuan, Suqin

Access status:

Open Access

Type

Thesis

Thesis type

Doctor of Philosophy

Author/s

Yuan, Suqin

Abstract

In many modern regimes, deep neural networks can overfit the training data while still generalizing well, which has enabled large-scale training and motivated neural scaling laws. Nevertheless, benign overfitting is not universal. In practical scenarios, such as learning with noisy labels or under tight computational budgets, when to stop training (and, more generally, what to stop training on) remains a consequential and under-explored question. Stopping too late can amplify memorization of spurious patterns and waste computation, while stopping too early can prevent the model from acquiring useful features. This thesis moves toward principled stopping strategies by grounding stopping rules in training dynamics rather than relying on clean validation sets or hand-tuned schedules. We first study learning dynamics through the lens of memorization and forgetting. By tracking prediction trajectories over epochs, we identify a stage transition in which networks begin to substantially fit spurious or mislabeled patterns, accompanied by a distinctive change in aggregate forgetting behavior. Leveraging this transition, we propose validation-free criteria that select a reliable stopping point directly from training-time signals, requiring neither additional data nor expensive preprocessing. Beyond epoch-level decisions, we explore instance-level stopping for efficient and robust optimization. We adopt a perspective that estimates whether an example has been sufficiently learned and adaptively reduces its participation in later training, thereby reallocating computation toward still-unmastered instances. Finally, we show that stopping principles also inform robust learning under imperfect supervision. We investigate a criterion based on the evolution of label consistency to assess whether the model has effectively learned an instance, and use it to identify high-confidence clean samples in the presence of label noise.
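As a rough illustration of the training-time signals the abstract mentions, the sketch below counts forgetting events from per-epoch prediction trajectories and lists examples that have been fit consistently over recent epochs. It is not the thesis's actual criterion: the function names, the `window` parameter, and the toy data are hypothetical, and the input is assumed to be a table recording, for each epoch, whether each training example's prediction matched its given label.

```python
# Minimal sketch (assumptions, not the thesis's method).
# correct[t][i] is True if example i was predicted as its given label at epoch t.

def forgetting_events_per_epoch(correct):
    """Count examples that flip from correct to incorrect between consecutive epochs."""
    return [
        sum(1 for prev, cur in zip(correct[t], correct[t + 1]) if prev and not cur)
        for t in range(len(correct) - 1)
    ]


def consistently_fit(correct, window):
    """Indices of examples predicted as their given label in each of the last `window` epochs."""
    recent = correct[-window:]
    return [i for i in range(len(correct[0])) if all(row[i] for row in recent)]


# Toy trajectories: 5 epochs, 4 examples (hypothetical data).
correct = [
    [True, False, False, False],
    [True, True, False, False],
    [True, True, True, False],
    [True, False, True, False],
    [True, True, True, True],
]

events = forgetting_events_per_epoch(correct)
stable = consistently_fit(correct, window=2)
```

A stopping rule in this spirit would monitor how the aggregate forgetting count changes across epochs, while a list like `stable` is one possible way to flag examples whose participation in later training could be reduced. Both uses are assumptions made for illustration.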

Date

2026

Rights statement

The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.

Faculty/School

Faculty of Engineering, School of Computer Science

Awarding institution

The University of Sydney

Subjects

Machine Learning
Deep Learning
Early Stopping