Multi-modal medical image synthetics
| Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Fu, Xingyue | |
| dc.date.accessioned | 2026-03-02T08:54:41Z | |
| dc.date.available | 2026-03-02T08:54:41Z | |
| dc.date.issued | 2026 | en |
| dc.identifier.uri | https://hdl.handle.net/2123/34912 | |
| dc.description.abstract | Deep learning has achieved expert-level performance in medical image analysis but remains constrained by data scarcity, class imbalance, and privacy issues, especially in domains like dental panoramic radiography (PR) and chest X-rays (CXR). This thesis explores the use of generative artificial intelligence (GAI)—including GANs, diffusion, and visual autoregressive (VAR) models—to address these challenges via the creation of synthetic medical images that are realistic, diverse, and privacy-preserving. The thesis comprises three interrelated studies: (1) A systematic review of publicly available synthetic medical image datasets, highlighting generation methods, quality evaluation strategies, and clinical use-cases, while identifying the need for standardized frameworks and reproducible practices. (2) A GAN-based pipeline for generating synthetic PRs with anatomical and disease-specific labels to support segmentation and classification tasks. Fusion strategies combining real and synthetic data demonstrated significant performance improvements under data-constrained and privacy-sensitive conditions. (3) A novel dual-stream autoregressive model that uses clinical text prompts to synthesize paired medical images and masks. The model incorporates a bidirectional cross-attention mechanism and two-stage fine-tuning to improve fidelity, alignment, and downstream segmentation performance in low-data settings. Together, these contributions demonstrate that task-aligned generative models can enhance robustness, fairness, and scalability in medical image analysis. This work lays a foundation for integrating synthetic data into privacy-conscious, multimodal clinical workflows. | en |
| dc.language.iso | en | en |
| dc.subject | Synthetic Medical Imaging | en |
| dc.subject | Data Augmentation | en |
| dc.subject | Generative Artificial Intelligence | en |
| dc.subject | Conditional Generation | en |
| dc.subject | Medical Imaging Analysis | en |
| dc.subject | Privacy Preservation | en |
| dc.title | Multi-modal medical image synthetics | en |
| dc.type | Thesis | |
| dc.type.thesis | Masters by Research | en |
| dc.rights.other | The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission. | en |
| usyd.faculty | SeS faculties schools::Faculty of Engineering::School of Computer Science | en |
| usyd.degree | Master of Philosophy M.Phil | en |
| usyd.awardinginst | The University of Sydney | en |
| usyd.advisor | Kim, Jinman | |
| usyd.include.pub | No | en |