Deep Feature Learning for Multi-Modality Dental Image Analysis
Access status:
USyd Access
Type
Thesis
Thesis type
Doctor of Philosophy
Author/s
Huang, Zimo
Abstract
Dental medical imaging underpins modern oral healthcare by supporting diagnosis, treatment planning, and clinical education, thereby reducing disease progression and long-term healthcare burdens. X-ray imaging remains the primary first-line modality due to its accessibility and efficiency. Two-dimensional radiographs, including intra-oral and panoramic imaging, are widely used for detecting caries, periapical pathology, and jaw abnormalities, while cone beam computed tomography (CBCT) has become the preferred modality for diagnosing odontogenic lesions by providing three-dimensional visualisation of lesion morphology. Histopathological examination remains the gold standard for definitive lesion differentiation at the cellular level. Despite their importance, dental images are challenging to interpret due to observer variability, image artefacts, inconsistent acquisition protocols, and the increasing volume and complexity of CBCT data. Existing computer-aided diagnostic (CAD) systems aim to address these challenges but are often limited by unimodal inputs, handcrafted features, limited interpretability, and poor generalisability, constrained by the scarcity of large, annotated dental datasets. This thesis proposes a unified deep learning framework for multi-modal dental image analysis to improve efficiency, interpretability, and clinical relevance. Four contributions are presented: OLS-Net, a multi-scale, auto-adaptive CBCT segmentation network for accurate lesion delineation; H2DT-Net, an interpretable, decision-tree-guided CBCT classification model aligned with clinical reasoning; a Grid-based Feature Fusion Network integrating CBCT and histopathological features; and VLM-IOR, a vision-language framework that generates correction-oriented feedback for intra-oral radiograph education.
Experimental results demonstrate consistent improvements over state-of-the-art methods, establishing a foundation for intelligent and clinically relevant dental imaging systems.
Date
2025
Rights statement
The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.
Faculty/School
Faculty of Engineering
Awarding institution
The University of Sydney