Deep Feature Learning for Multi-Modality Dental Image Analysis

Huang, Zimo

Access status:

USyd Access

Type

Thesis

Thesis type

Doctor of Philosophy

Author/s

Huang, Zimo

Abstract

Dental medical imaging underpins modern oral healthcare by supporting diagnosis, treatment planning, and clinical education, thereby reducing disease progression and long-term healthcare burdens. X-ray imaging remains the primary first-line modality due to its accessibility and efficiency. Two-dimensional radiographs, including intra-oral and panoramic imaging, are widely used for detecting caries, periapical pathology, and jaw abnormalities, while cone beam computed tomography (CBCT) has become the preferred modality for diagnosing odontogenic lesions by providing three-dimensional visualisation of lesion morphology. Histopathological examination remains the gold standard for definitive lesion differentiation at the cellular level. Despite their importance, dental images are challenging to interpret due to observer variability, image artefacts, inconsistent acquisition protocols, and the increasing volume and complexity of CBCT data. Existing computer-aided diagnostic (CAD) systems aim to address these challenges but are often limited by unimodal inputs, handcrafted features, limited interpretability, and poor generalisability, constrained by the scarcity of large, annotated dental datasets. This thesis proposes a unified deep learning framework for multi-modal dental image analysis to improve efficiency, interpretability, and clinical relevance. Four contributions are presented: OLS-Net, a multi-scale, auto-adaptive CBCT segmentation network for accurate lesion delineation; H2DT-Net, an interpretable, decision-tree-guided CBCT classification model aligned with clinical reasoning; a Grid-based Feature Fusion Network integrating CBCT and histopathological features; and VLM-IOR, a vision-language framework that generates correction-oriented feedback for intra-oral radiograph education. Experimental results demonstrate consistent improvements over state-of-the-art methods, establishing a foundation for intelligent and clinically relevant dental imaging systems.

Date

2025

Rights statement

The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.

Faculty/School

Faculty of Engineering

Awarding institution

The University of Sydney

Subjects

Deep Learning
Dental Medical Imaging
Multi-Modality Analysis
Computer-aided Diagnosis