| Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Yang, Xianghui | |
| dc.date.accessioned | 2024-05-08T05:56:38Z | |
| dc.date.available | 2024-05-08T05:56:38Z | |
| dc.date.issued | 2024 | en_AU |
| dc.identifier.uri | https://hdl.handle.net/2123/32537 | |
| dc.description | Includes publication | |
| dc.description.abstract | The field of artificial intelligence has witnessed remarkable progress due to the advancement of deep neural networks. However, improving the ability of deep learning models to perform well on out-of-distribution data remains a major challenge. This doctoral thesis addresses the issue of generalization in deep learning for 2D, 2D-to-3D, and 3D tasks. For the 2D task of few-shot semantic segmentation, we propose a novel framework named BriNet with two key contributions. First, we introduce an information exchange module that augments the feature representations of both support and query images, and we devise a more fine-grained way to localize the objects in the query image. Second, we propose a new online refinement strategy to adapt the trained model to unseen test objects. Shifting from the 2D task to the 2D-to-3D task, specifically single-view 3D mesh reconstruction, we present a novel framework, GenMesh, with three strategies to improve the model's generalization ability on novel classes and prevent overfitting: learning an intermediate point cloud representation, employing local features, and introducing a multi-view silhouette loss for model regularization. For the 3D task of surface reconstruction, we introduce a novel 3D representation called Neural Vector Fields (NVF). Leveraging this representation, we present two frameworks that utilize cross-category information to enhance generalization on novel classes. The first framework, NVF (lite), employs a hard codebook and serves as a precursor. It is followed by NVF (ultra), which incorporates a soft codebook and introduces zero-curl and direction regularization, further enhancing generalization capabilities. Extensive experiments conducted in this thesis validate the effectiveness of the proposed methodologies in improving the generalization capacity of deep learning models across various tasks. | en_AU |
| dc.language.iso | en | en_AU |
| dc.subject | computer vision | en_AU |
| dc.subject | deep learning | en_AU |
| dc.subject | model generalization | en_AU |
| dc.subject | few-shot segmentation | en_AU |
| dc.subject | 3D reconstruction | en_AU |
| dc.subject | novel data | en_AU |
| dc.title | Enhancing Novel-class Generalization of Deep Learning Models for Vision Tasks | en_AU |
| dc.type | Thesis | |
| dc.type.thesis | Doctor of Philosophy | en_AU |
| dc.rights.other | The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission. | en_AU |
| usyd.faculty | SeS faculties schools::Faculty of Engineering::School of Electrical and Information Engineering | en_AU |
| usyd.degree | Doctor of Philosophy Ph.D. | en_AU |
| usyd.awardinginst | The University of Sydney | en_AU |
| usyd.advisor | Zhou, Luping | |
| usyd.include.pub | Yes | en_AU |

