Multi-modal medical image synthetics

Fu, Xingyue

Access status:

USyd Access

Field	Value	Language
dc.contributor.author	Fu, Xingyue
dc.date.accessioned	2026-03-02T08:54:41Z
dc.date.available	2026-03-02T08:54:41Z
dc.date.issued	2026	en
dc.identifier.uri	https://hdl.handle.net/2123/34912
dc.description.abstract	Deep learning has achieved expert-level performance in medical image analysis but remains constrained by data scarcity, class imbalance, and privacy issues, especially in domains like dental panoramic radiography (PR) and chest X-rays (CXR). This thesis explores the use of generative artificial intelligence (GAI), including GANs, diffusion models, and visual autoregressive (VAR) models, to address these challenges by creating synthetic medical images that are realistic, diverse, and privacy-preserving. The thesis comprises three interrelated studies: (1) A systematic review of publicly available synthetic medical image datasets, highlighting generation methods, quality evaluation strategies, and clinical use cases, while identifying the need for standardized frameworks and reproducible practices. (2) A GAN-based pipeline for generating synthetic PRs with anatomical and disease-specific labels to support segmentation and classification tasks. Fusion strategies combining real and synthetic data demonstrated significant performance improvements under data-constrained and privacy-sensitive conditions. (3) A novel dual-stream autoregressive model that uses clinical text prompts to synthesize paired medical images and masks. The model incorporates a bidirectional cross-attention mechanism and two-stage fine-tuning to improve fidelity, alignment, and downstream segmentation performance in low-data settings. Together, these contributions demonstrate that task-aligned generative models can enhance robustness, fairness, and scalability in medical image analysis. This work lays a foundation for integrating synthetic data into privacy-conscious, multimodal clinical workflows.	en
dc.language.iso	en	en
dc.subject	Synthetic Medical Imaging	en
dc.subject	Data Augmentation	en
dc.subject	Generative Artificial Intelligence	en
dc.subject	Conditional Generation	en
dc.subject	Medical Imaging Analysis	en
dc.subject	Privacy Preservation	en
dc.title	Multi-modal medical image synthetics	en
dc.type	Thesis
dc.type.thesis	Masters by Research	en
dc.rights.other	The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.	en
usyd.faculty	SeS faculties schools::Faculty of Engineering::School of Computer Science	en
usyd.degree	Master of Philosophy M.Phil	en
usyd.awardinginst	The University of Sydney	en
usyd.advisor	Kim, Jinman
usyd.include.pub	No	en