Multi-agent System and Reinforcement Learning in Medical Report Generation

Wang, Pengyu

Access status:

Open Access

Type

Thesis

Thesis type

Masters by Research

Author/s

Wang, Pengyu

Abstract

Medical large vision-language models have enabled automatic medical report generation, yet two challenges still limit diagnostic utility: normality bias and token-level training misalignment. Models often under-detect abnormalities and miss clinically important findings, while token-level imitation captures writing style rather than report-level clinical correctness. To address these challenges, this thesis proposes two complementary approaches: (i) MRGAgents, a disease-specific multi-agent system that decomposes reporting into condition-focused subtasks for more balanced and comprehensive coverage; and (ii) MRG-R1, a fine-tuning paradigm based on semantic-driven reinforcement learning (SRL) that directly optimizes report-level clinical correctness and factual alignment. MRGAgents uses specialized agents trained on curated disease-specific subsets of IU X-Ray and MIMIC-CXR, giving each agent stronger discrimination and descriptive ability for its target conditions. At inference, the agents' outputs are aggregated to better balance normal and abnormal findings and to provide more complete diagnostic descriptions. Empirically, MRGAgents improved coverage and abnormality reporting over strong baselines, reducing missed findings. MRG-R1 implements SRL with Group Relative Policy Optimization and a Margin CheXbert Cosine Similarity (MCCS) reward computed on key radiologic findings. This directly optimizes report-level clinical-label agreement and semantic consistency beyond surface fluency. Evaluated on IU X-Ray and MIMIC-CXR with clinical efficacy (CE) metrics, MRG-R1 achieved state-of-the-art CE-F1. Ablation studies showed that MCCS provided finer-grained supervision than CE-F1-based objectives, while an explicit reasoning-to-report process encouraged structured generation and improved diagnostic accuracy with minimal computational overhead.
Overall, these architectural and training contributions improve report comprehensiveness, abnormality sensitivity, and clinical correctness for chest X-ray report generation.
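The reward-and-advantage step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not give the exact form of the margin reward, so `margin_cosine_reward`, its `margin` parameter, and the rescaling to [0, 1] are assumptions made for illustration, and the label vectors stand in for CheXbert outputs that would come from a real labeler. The group-relative normalization (reward minus group mean, divided by group standard deviation) follows the usual description of Group Relative Policy Optimization; the population standard deviation is used here as a simplification.

```python
import math
from statistics import mean, pstdev


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length numeric vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def margin_cosine_reward(pred_labels, ref_labels, margin=0.5):
    """Hypothetical margin reward: zero at or below the margin, rescaled to [0, 1] above it.

    The exact formulation used in the thesis is not stated in the abstract;
    this is one plausible form, chosen only to make the idea concrete.
    """
    if not 0.0 <= margin < 1.0:
        raise ValueError("margin must be in [0, 1)")
    sim = cosine_similarity(pred_labels, ref_labels)
    return max(0.0, (sim - margin) / (1.0 - margin))


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one sampled group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption, see lead-in
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Stand-in finding-label vectors (1 = finding present); not real CheXbert output.
    reference = [1, 0, 1, 0]
    sampled_reports = [
        [1, 0, 1, 0],  # agrees with the reference on every finding
        [1, 0, 0, 0],  # misses one finding
        [0, 1, 0, 1],  # reports only findings absent from the reference
    ]
    rewards = [margin_cosine_reward(s, reference) for s in sampled_reports]
    print(rewards)
    print(group_relative_advantages(rewards))
```

Because advantages are computed relative to the other reports sampled for the same image, a report is reinforced only when its label agreement beats that of its group, which is one way a report-level signal can replace token-level imitation.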

Date

2026

Rights statement

The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.

Faculty/School

Faculty of Engineering, School of Computer Science

Awarding institution

The University of Sydney

Subjects

Medical Report Generation
Chest X-ray
Multi-agent System
Reinforcement Learning