Short Text Clustering with Large Language Models
Field | Value | Language |
dc.contributor.author | Miller, Justin | |
dc.date.accessioned | 2025-05-22T06:11:04Z | |
dc.date.available | 2025-05-22T06:11:04Z | |
dc.date.issued | 2025 | en_AU |
dc.identifier.uri | https://hdl.handle.net/2123/33925 | |
dc.description.abstract | This thesis addresses the challenges of clustering short text data, focusing on human interpretability and validation metrics. Employing Gaussian Mixture Models with embeddings from Large Language Models, it demonstrates that these methods produce clusters that are more interpretable than those produced by traditional approaches. The thesis introduces the concept of multi-level clustering, an approach that examines how clusters form and evolve as the number of clusters in an algorithm increases. It also introduces a method to maximise the information conveyed in each cluster while minimising the cognitive load required to understand the clusters. The findings bridge the gap between automated metrics and human evaluation, offering insights into optimal clustering techniques for short text. These techniques are then used to examine human identity in Twitter bios, to create visualisations that provide a better understanding of the clusters, and, through linguistic methodology, to identify key distinctions between the clusters. | en_AU |
dc.language.iso | en | en_AU |
dc.subject | Clustering | en_AU |
dc.subject | Large Language Models | en_AU |
dc.subject | Short Text | en_AU |
dc.title | Short Text Clustering with Large Language Models | en_AU |
dc.type | Thesis | |
dc.type.thesis | Doctor of Philosophy | en_AU |
dc.rights.other | The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission. | en_AU |
usyd.faculty | SeS faculties schools::Faculty of Science::School of Physics | en_AU |
usyd.department | Physics | en_AU |
usyd.degree | Doctor of Philosophy Ph.D. | en_AU |
usyd.awardinginst | The University of Sydney | en_AU |
usyd.advisor | Alexander, Tristram | |