Multi-modal medical image synthetics
Access status:
USyd Access
Type
Thesis
Thesis type
Masters by Research
Author/s
Fu, Xingyue
Abstract
Deep learning has achieved expert-level performance in medical image analysis but remains constrained by data scarcity, class imbalance, and privacy issues, especially in domains like dental panoramic radiography (PR) and chest X-rays (CXR). This thesis explores the use of generative artificial intelligence (GAI), including GANs, diffusion, and visual autoregressive (VAR) models, to address these challenges via the creation of synthetic medical images that are realistic, diverse, and privacy-preserving. The thesis comprises three interrelated studies: (1) A systematic review of publicly available synthetic medical image datasets, highlighting generation methods, quality evaluation strategies, and clinical use-cases, while identifying the need for standardized frameworks and reproducible practices. (2) A GAN-based pipeline for generating synthetic PRs with anatomical and disease-specific labels to support segmentation and classification tasks. Fusion strategies combining real and synthetic data demonstrated significant performance improvements under data-constrained and privacy-sensitive conditions. (3) A novel dual-stream autoregressive model that uses clinical text prompts to synthesize paired medical images and masks. The model incorporates a bidirectional cross-attention mechanism and two-stage fine-tuning to improve fidelity, alignment, and downstream segmentation performance in low-data settings. Together, these contributions demonstrate that task-aligned generative models can enhance robustness, fairness, and scalability in medical image analysis. This work lays a foundation for integrating synthetic data into privacy-conscious, multimodal clinical workflows.
Date
2026
Rights statement
The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.
Faculty/School
Faculty of Engineering, School of Computer Science
Awarding institution
The University of Sydney