A Statistical Framework For Nutriomics Data Analysis
Access status:
Open Access
Type
Thesis
Thesis type
Doctor of Philosophy
Author/s
Xu, Xiangnan
Abstract
Nutriomics is a new discipline that investigates the relationship between nutrition and health through the use of high-throughput omics technologies. However, the inherent complexity of nutriomics data poses several challenges for data analysis. In this thesis, we introduce nutriomics and the statistical challenges associated with its analysis, and we propose statistical modelling and machine learning methods to tackle three main challenges: non-linearity, high dimensionality, and data heterogeneity. We first propose a statistical framework, which we call LC-N2G, to test whether nutrition intake and omics features of interest are significantly associated. We use public data as an example to show LC-N2G's ability to discover non-linear associations between nutrition and gene expression. We then propose a statistical method, which we call eNODAL, to cluster high-dimensional omics features based on how they respond to nutrition intake. Applying eNODAL to a mouse proteomics nutrition study shows that it can identify interpretable clusters of proteins with similar responses to diet and drug treatment. Finally, we propose a statistical model, which we call NEMoE, to uncover the heterogeneous interplay among diet, omics, and health outcomes. We use a microbiome Parkinson’s disease (PD) study to illustrate the method and show that NEMoE is able to identify diet-specific microbial signatures of PD. Overall, this thesis proposes statistical methods to analyze nutriomics data and outlines possible future extensions of the research. The methods proposed in this thesis could help researchers better understand the complex relationships between nutrition and health, ultimately leading to improved health outcomes.
Date
2023
Rights statement
The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.
Faculty/School
Faculty of Science, School of Mathematics and Statistics
Department, Discipline or Centre
Mathematics and Statistics Academic Operations
Awarding institution
The University of Sydney