Multi-modal medical image synthetics

Fu, Xingyue

Access status:

USyd Access

Type

Thesis

Thesis type

Masters by Research

Author/s

Fu, Xingyue

Abstract

Deep learning has achieved expert-level performance in medical image analysis but remains constrained by data scarcity, class imbalance, and privacy issues, especially in domains like dental panoramic radiography (PR) and chest X-rays (CXR). This thesis explores the use of generative artificial intelligence (GAI), including GANs, diffusion, and visual autoregressive (VAR) models, to address these challenges via the creation of synthetic medical images that are realistic, diverse, and privacy-preserving. The thesis comprises three interrelated studies: (1) A systematic review of publicly available synthetic medical image datasets, highlighting generation methods, quality evaluation strategies, and clinical use-cases, while identifying the need for standardized frameworks and reproducible practices. (2) A GAN-based pipeline for generating synthetic PRs with anatomical and disease-specific labels to support segmentation and classification tasks. Fusion strategies combining real and synthetic data demonstrated significant performance improvements under data-constrained and privacy-sensitive conditions. (3) A novel dual-stream autoregressive model that uses clinical text prompts to synthesize paired medical images and masks. The model incorporates a bidirectional cross-attention mechanism and two-stage fine-tuning to improve fidelity, alignment, and downstream segmentation performance in low-data settings. Together, these contributions demonstrate that task-aligned generative models can enhance robustness, fairness, and scalability in medical image analysis. This work lays a foundation for integrating synthetic data into privacy-conscious, multimodal clinical workflows.

Date

2026

Rights statement

The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.

Faculty/School

Faculty of Engineering, School of Civil Engineering

Awarding institution

The University of Sydney

Subjects

Synthetic Medical Imaging
Data Augmentation
Generative Artificial Intelligence
Conditional Generation
Medical Imaging Analysis
Privacy Preservation