Novel deep learning-based methods for improved prediction and feature-learning in high-throughput proteomic and transcriptomic data
Field | Value | Language |
dc.contributor.author | Geddes, Thomas Andrew | |
dc.date.accessioned | 2025-07-14T06:08:34Z | |
dc.date.available | 2025-07-14T06:08:34Z | |
dc.date.issued | 2025 | en_AU |
dc.identifier.uri | https://hdl.handle.net/2123/34107 | |
dc.description.abstract | The rise of high-throughput omics technologies has allowed researchers to measure biomolecular species of interest en masse at the sample or individual cell level. These technologies, including bulk and single-cell transcriptomics, mass spectrometry (MS) proteomics, and other MS techniques capable of quantifying post-translational modifications (PTMs) of proteins, produce extremely large datasets, presenting new opportunities and challenges for data analysis. These datasets may capture complex relationships in the regulation of genes, proteins and PTMs. However, sophisticated techniques are required both to extract this information and to overcome the pathologies and challenges that arise. Issues such as missingness, biological noise and the curse of dimensionality make these datasets non-trivial to analyse. This thesis explores different approaches to analysing high-throughput datasets, extracting useful information and addressing some of the challenges involved. Chapter 2 introduces Thunderbolt, a traditional analysis pipeline which provides tools for diagnosis and remedy of pathologies inherent to specific MS proteomics datasets; differential expression analysis; and downstream analysis tools. The chapter demonstrates a full analysis workflow to address a specific hypothesis and discusses approaches to dealing with dataset pathologies. Chapter 3 introduces scCCESS, a flexible autoencoder-based framework for improving the performance of clustering methods when applied to single-cell RNA-seq datasets by diversifying and simplifying inputs to the chosen clustering algorithm. Chapter 4 introduces ConGregatE-PPI, a predictive ensemble artificial neural network model which leverages complementary information from multiple datasets to improve prediction of protein-protein interactions in a specific biological context. | en_AU |
dc.language.iso | en | en_AU |
dc.subject | omics | en_AU |
dc.subject | proteomics | en_AU |
dc.subject | transcriptomics | en_AU |
dc.subject | deep learning | en_AU |
dc.subject | bioinformatics | en_AU |
dc.title | Novel deep learning-based methods for improved prediction and feature-learning in high-throughput proteomic and transcriptomic data | en_AU |
dc.type | Thesis | |
dc.type.thesis | Doctor of Philosophy | en_AU |
dc.rights.other | The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission. | en_AU |
usyd.faculty | SeS faculties schools::Faculty of Science::School of Life and Environmental Sciences | en_AU |
usyd.degree | Doctor of Philosophy Ph.D. | en_AU |
usyd.awardinginst | The University of Sydney | en_AU |
usyd.advisor | Burchfield, James | |