Predictive Modelling of the Comorbidity of Chronic Diseases: A Network and Machine Learning Approach
Access status:
Open Access
Type
Thesis
Thesis type
Doctor of Philosophy
Author/s
Hossain, Md Ekramul
Abstract
Chronic diseases have become increasingly common and caused most of the burden of ill health in many countries. They are associated with adverse health outcomes in terms of mobility and quality of life, as well as an increased financial burden. Chronic diseases pose several health risks for patients suffering from more than one chronic disease (also known as comorbidity of chronic diseases). The prevalence of chronic disease comorbidity has increased globally. Understanding the progression of comorbidities and predicting the risks can provide valuable insights into the prevention and better management of chronic diseases. The availability of administrative datasets provides an opportunity to apply a predictive model to improve the healthcare system. Most studies in this field focus on understanding the progression of one chronic disease rather than multiple chronic diseases. Analysis of administrative data using a network approach and machine learning techniques can help predict the risk of comorbidity of chronic diseases.

In this thesis, we propose a risk prediction model that uses network-based features derived from administrative data, together with machine learning techniques, to assess the risk of chronic disease comorbidities. This study has two broad goals: (1) to understand and represent the progression of comorbidity of chronic diseases, and (2) to develop a risk prediction model based on the disease progression to predict the comorbidity of chronic diseases for chronic disease patients. Specifically, it focuses on the comorbidity progression of CVD in patients with T2D, as a high proportion of older adults with T2D develop CVD. We used administrative data and network analytics to implement the first part of this study, and we used machine learning techniques for the second part. For this, two cohorts (i.e. patients with both T2D and CVD and patients with only T2D) were identified from an administrative dataset collected from private healthcare funds based in Australia. Two baseline disease networks were generated from the two study cohorts. A final disease network was then generated from the two baseline disease networks through normalisation. We extracted social network-based features (i.e. the prevalence of comorbidities, transition patterns and clustering membership) from the final disease network and some demographic characteristics directly from the dataset. These risk factors were then used to develop six machine learning prediction models (logistic regression, support vector machine, decision tree, random forest, Naïve Bayes and k-nearest neighbour) to assess the risk of CVD in patients with T2D.

The results showed that the prevalence of renal failure, fluid and electrolyte disorders, hypertension, and obesity was significantly higher in patients with both CVD and T2D than in patients with only T2D. This indicated that these chronic diseases occurred frequently during the progression of CVD in patients with T2D. This study measured performance in terms of accuracy, precision, recall, F1 score and area under the curve (AUC). The model based on random forest showed the highest accuracy (87.50%) and an AUC of 0.83. Overall, the accuracy of the classifiers ranged from 79% to 88%, which shows the potential of the network-based and machine learning-based risk prediction model using administrative data. The proposed model may help healthcare providers to understand high-risk chronic diseases and the progression patterns between the recurrence of multiple chronic diseases. Further, the comorbid risk prediction model could be useful for medical practice and stakeholders (including government and private health insurers) to develop health management programs for patients at high risk of developing multiple chronic diseases.
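The network-building step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the baseline networks are normalised, so the approach here is an assumption: each baseline network is divided by its cohort size, and the final network keeps only the diseases and transitions that are more frequent in the T2D-and-CVD cohort than in the T2D-only cohort. The function names and the toy patient histories are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def baseline_network(histories):
    """Build a baseline disease network from ordered diagnosis histories.

    Nodes count how many patients had each disease (prevalence);
    edges count transitions between consecutive, distinct diagnoses.
    """
    nodes, edges = Counter(), Counter()
    for visits in histories:
        nodes.update(set(visits))  # count each disease once per patient
        for a, b in zip(visits, visits[1:]):
            if a != b:
                edges[(a, b)] += 1
    return nodes, edges


def normalise(counts, n_patients):
    """Scale raw counts by cohort size (assumed normalisation)."""
    return {key: value / n_patients for key, value in counts.items()}


def excess(case, control):
    """Keep entries that are more frequent in the case cohort."""
    return {
        key: value - control.get(key, 0.0)
        for key, value in case.items()
        if value > control.get(key, 0.0)
    }


def final_network(case_histories, control_histories):
    """Combine two baseline networks into one final network."""
    case_nodes, case_edges = baseline_network(case_histories)
    ctrl_nodes, ctrl_edges = baseline_network(control_histories)
    n_case, n_ctrl = len(case_histories), len(control_histories)
    nodes = excess(normalise(case_nodes, n_case), normalise(ctrl_nodes, n_ctrl))
    edges = excess(normalise(case_edges, n_case), normalise(ctrl_edges, n_ctrl))
    return nodes, edges


# Hypothetical toy data: one list of diagnoses per patient, in time order.
t2d_and_cvd = [
    ["hypertension", "renal failure"],
    ["obesity", "hypertension", "renal failure"],
]
t2d_only = [
    ["hypertension"],
    ["obesity", "hypertension"],
]

nodes, edges = final_network(t2d_and_cvd, t2d_only)
print(nodes)
print(edges)
```

In this toy example only renal failure, and the transition from hypertension to renal failure, remain in the final network, because hypertension and obesity are equally common in both cohorts. Node and edge weights of this kind could then serve as prevalence and transition features for a classifier.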
Date
2020
Publisher
University of Sydney
Rights statement
The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.
Faculty/School
Faculty of Engineering, School of Project Management
Awarding institution
The University of Sydney