New methods for understanding social media data for online health information
Access status:
USyd Access
Type
Thesis
Thesis type
Doctor of Philosophy
Author/s
Naseem, Usman
Abstract
This thesis aims to address critical gaps in classifying online user-generated content, encompassing both textual and visual elements, to understand the attitudes conveyed by individuals and categorize information based on its intent. Social media platforms have transformed how people interact and share information, including health-related discussions. However, the existing methodologies for gathering health data, such as surveys and documentation, are hindered by manual data collection and analysis, leaving room for bias and incomplete information capture from the vast array of user-generated content. This thesis presents a comprehensive approach that leverages supervised learning techniques to analyze user-generated content effectively. This thesis aims to demonstrate how these techniques surpass traditional methods in comprehensively processing large datasets, capturing contextual and semantic information, and addressing the inherent ordinal nature of social media data. The objectives of this thesis are to (i) develop and evaluate methods for encoding domain-specific knowledge into language models and recurrent neural networks tailored for public health surveillance on social media, (ii) propose innovative strategies to incorporate contextual and semantic knowledge into health mention classification tasks, including identifying target keywords and user behaviour patterns, (iii) introduce novel approaches to represent ordinal input data for identifying depression and suicide, considering inter-class relationships and multi-level information, and (iv) present methods for improving the representation of multimodal data (text and images) to identify vaccine-critical content and user intent, including global and local content representations. Empirical evaluation shows that the proposed methods outperform the latest techniques on benchmark datasets, demonstrating their efficacy for understanding social media data for online health information.
Date
2023
Rights statement
The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.
Faculty/School
Faculty of Engineering, School of Civil Engineering
Awarding institution
The University of Sydney