Strategies to Ensure Intersectional Fairness in Vision-Language Models for Clinical Decision Support

Zhang, Yupeng

Access status

Open Access

Type

Thesis

Thesis type

Masters by Research

Author/s

Zhang, Yupeng

Abstract

The rapid integration of artificial intelligence (AI), particularly Vision-Language Models (VLMs), into decision support systems for medical diagnosis promises to improve healthcare outcomes. However, these models can inherit and amplify societal biases, leading to significant performance disparities across diverse patient subgroups. This thesis addresses a critical and often overlooked challenge: intersectional fairness, where compounded disadvantages emerge for individuals defined by multiple demographic attributes (e.g., race and gender). Existing fairness interventions, which typically focus on a single demographic attribute, often fail to mitigate these compounded biases and can inadvertently degrade overall model performance or mask subtle but clinically significant disparities in diagnostic certainty. This thesis introduces a novel regularisation framework, Cross-Modal Alignment Consistency Maximum Mean Discrepancy (CMAC-MMD), that addresses intersectional fairness at the decision level of the model architecture. The approach represents a conceptual shift from manipulating image and text features to directly equalising the model's diagnostic confidence across all intersectional subgroups. CMAC-MMD defines a scalar "cross-modal alignment score" that serves as a proxy for the model's certainty, and applies a fairness loss that aligns the statistical distributions of these scores across subgroups. This compels the model to produce predictions with equitable confidence and decisiveness for all patient subgroups, regardless of their demographic profile, without requiring sensitive data at inference time. The effectiveness of the proposed framework is evaluated by benchmarking on dermatology and ophthalmology datasets for disease classification.
The results demonstrate that CMAC-MMD reduces intersectional performance disparities across multiple fairness metrics while maintaining overall diagnostic accuracy comparable to that of baseline models.
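
The idea of aligning score distributions across intersectional subgroups can be illustrated with a minimal NumPy sketch. It is not the thesis implementation, and several details are assumptions made for illustration only. The alignment score is taken to be the cosine similarity between an image embedding and the text embedding of its true class. The kernel is a Gaussian with a fixed bandwidth. The penalty is the mean squared MMD over all pairs of (race, gender) subgroups. The function names `alignment_scores`, `mmd2` and `intersectional_mmd_penalty` are hypothetical.

```python
import numpy as np


def alignment_scores(image_emb, text_emb, labels):
    """Assumed score: cosine similarity between each image embedding and the
    text embedding of its ground-truth class (one scalar per sample)."""
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    return np.sum(img * txt[labels], axis=1)


def rbf_kernel(x, y, bandwidth):
    """Gaussian kernel matrix between two 1-D samples of scalar scores."""
    diff = x[:, None] - y[None, :]
    return np.exp(-(diff ** 2) / (2.0 * bandwidth ** 2))


def mmd2(x, y, bandwidth=0.5):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (rbf_kernel(x, x, bandwidth).mean()
            + rbf_kernel(y, y, bandwidth).mean()
            - 2.0 * rbf_kernel(x, y, bandwidth).mean())


def intersectional_mmd_penalty(scores, race, gender, bandwidth=0.5):
    """Assumed aggregation: mean squared MMD over all pairs of
    intersectional (race, gender) subgroups present in the batch."""
    groups = {}
    for s, r, g in zip(scores, race, gender):
        groups.setdefault((r, g), []).append(s)
    samples = [np.asarray(v, dtype=float) for v in groups.values()]
    pairs = [(a, b) for i, a in enumerate(samples) for b in samples[i + 1:]]
    if not pairs:  # fewer than two subgroups in the batch
        return 0.0
    return float(np.mean([mmd2(a, b, bandwidth) for a, b in pairs]))


# Toy batch: one subgroup receives systematically lower alignment scores.
scores = np.array([0.90, 0.80, 0.85, 0.20, 0.30, 0.25])
race = ["a", "a", "a", "b", "b", "b"]
gender = ["f", "f", "f", "f", "f", "f"]
penalty = intersectional_mmd_penalty(scores, race, gender)
```

In a training setting, a penalty of this kind would be added to the classification loss with a weighting coefficient and computed in an automatic-differentiation framework rather than NumPy. The demographic labels are used only to compute the penalty during training, which is consistent with the abstract's statement that sensitive data are not required at inference time.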

Date

2026

Rights statement

The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.

Faculty/School

Faculty of Engineering, School of Computer Science

Awarding institution

The University of Sydney

Subjects

Intersectional fairness
vision-language models
algorithmic fairness
medical image classification
bias mitigation
Maximum Mean Discrepancy
trustworthy AI