Self-Supervised Intrinsic Representation Learning on Medical Images for Universal Foundation Models

Ma, Yang

Access status: USyd Access

Field	Value	Language
dc.contributor.author	Ma, Yang
dc.date.accessioned	2026-03-03T22:36:43Z
dc.date.available	2026-03-03T22:36:43Z
dc.date.issued	2025	en
dc.identifier.uri	https://hdl.handle.net/2123/34930
dc.description	Includes publication
dc.description.abstract	Deep learning has advanced medical imaging, yet clinical adoption is limited by scarce annotations, weak interpretability, and poor generalization. A key limitation of existing methods is their purely data-driven nature, which overlooks intrinsic properties of medical images such as anatomical symmetry, spatial hierarchies, and disease-specific structural priors. This thesis proposes a unified framework that embeds these priors into model architectures and training objectives, improving data efficiency, robustness, and interpretability. First, I introduce a Symmetry-Aware Cross-Attention (SACA) module for brain disease diagnosis with limited supervision. By performing cross-attention between original and flipped 3D brain volumes and applying contrastive pretraining, SACA captures hemispheric asymmetries and yields improved classification and lesion segmentation on multi-center MRI datasets. Second, I develop MSFormer, a multi-scale vision–language transformer for medical visual question answering. With Multi-Scale Positional Embedding and Grouped Attention, MSFormer fuses coarse anatomical context with fine-grained lesion details and aligns visual evidence with clinical questions, achieving state-of-the-art performance on medical VQA benchmarks. Finally, I propose MSCAMA, a scale-aware multi-agent framework that unifies hierarchical visual encoding, adaptive retrieval, and evidence-grounded reasoning using a scale-aware backbone and specialized agents for report understanding, pairwise comparison, and question answering. Experiments demonstrate consistent improvements in retrieval accuracy, clinical relevance, and reasoning quality. Together, these contributions advance interpretable, generalizable, and data-efficient medical imaging AI.	en
dc.language.iso	en	en
dc.subject	Medical Imaging	en
dc.subject	Self-supervised Learning	en
dc.subject	Foundation Models	en
dc.title	Self-Supervised Intrinsic Representation Learning on Medical Images for Universal Foundation Models	en
dc.type	Thesis
dc.type.thesis	Doctor of Philosophy	en
dc.rights.other	The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.	en
usyd.faculty	SeS faculties schools::Faculty of Engineering	en
usyd.degree	Doctor of Philosophy Ph.D.	en
usyd.awardinginst	The University of Sydney	en
usyd.advisor	Cai, Weidong
usyd.include.pub	Yes	en