Multimodal NLP in Mental Healthcare
| Field | Value | Language |
| dc.contributor.author | Cabral, Rina Carines Manumbali | |
| dc.date.accessioned | 2026-05-22T00:04:56Z | |
| dc.date.available | 2026-05-22T00:04:56Z | |
| dc.date.issued | 2026 | en_AU |
| dc.identifier.uri | https://hdl.handle.net/2123/35336 | |
| dc.description | Includes publication | |
| dc.description.abstract | Mental health has been a growing concern for countries, communities, and individuals. Despite considerable advances, mental healthcare systems still face significant challenges, prompting researchers to explore opportunities in deep learning and natural language processing. Recent research trends have shifted toward incorporating various media-based modalities, including videos, images, and physiological data. This shift, while promising, introduces new limitations, particularly in data accessibility and research reproducibility. This thesis addresses these challenges by leveraging the ubiquity of textual data in mental health-related settings, aiming to exploit complementary text-derived information at different abstraction levels to enrich textual representations beyond standard semantic contextualisation. The main contributions are threefold: three abstraction-level modalities, each paired with a different approach to multimodal integration, to improve mental health risk detection and information extraction. First, inspired by the complexity of human emotions and language, affective information from the emotion modality is integrated through multi-emotion graph pretraining for depression and suicide risk detection. Second, the acoustic modality is introduced, capturing prosodic information derived from textual data and integrating it, along with emotion and textual abstractions, through a multi-teacher knowledge distillation framework for the same mental health tasks. Finally, the word-pair modality is explored, offering a novel perspective on relational-structural abstraction from raw textual input; integrated through a triplet-grid framework, it improves word-boundary detection for the extraction of disjointed adverse drug reactions. | en_AU |
| dc.language.iso | en | en_AU |
| dc.subject | Multimodal AI | en_AU |
| dc.subject | Natural Language Processing | en_AU |
| dc.subject | Mental Health | en_AU |
| dc.title | Multimodal NLP in Mental Healthcare | en_AU |
| dc.type | Thesis | |
| dc.type.thesis | Doctor of Philosophy | en_AU |
| dc.rights.other | The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission. | en |
| usyd.faculty | SeS faculties schools::Faculty of Engineering::School of Computer Science | en_AU |
| usyd.degree | Doctor of Philosophy Ph.D. | en_AU |
| usyd.awardinginst | The University of Sydney | en_AU |
| usyd.advisor | Poon, Josiah | |
| usyd.include.pub | Yes | en_AU |