Please use this identifier to cite or link to this item: http://hdl.handle.net/2445/213150
Title: Exploring machine learning approaches for phenotype prediction of Huntington's disease
Author: Fuses i Kuzmina, Caterina
Director/Tutor: Canals Coll, Josep Maria
Keywords: Biomedical engineering
Biomedical materials
Bachelor's theses
Issue Date: 5-May-2024
Abstract: The onset of symptoms in Huntington’s disease (HD) is clinically predicted primarily from the length of the CAG trinucleotide expansion in the HTT gene. However, this predictor explains only around 50% of the variability in the phenotype. An estimated 40% of the remaining variability is heritable, suggesting the presence of other genetic factors. Genome-wide association studies (GWAS) have identified potential genetic modifiers, but they capture only linear effects and rely on computationally demanding processes. This project benchmarks several machine learning algorithms trained on an Enroll-HD GWAS dataset to predict the age at HD onset. The dataset comprises the genotypes of millions of SNPs from approximately 9,000 individuals. The models considered include regularized linear models (Lasso and Elastic Net) and tree-based models (Random Forest and XGBoost); their predictive power is compared with that of an Ordinary Least Squares baseline model trained solely with sex and CAG as covariates. The results indicate that tree-based models achieve the best estimate of age of onset (AO), improving the prediction by 3% with respect to the baseline, possibly because they implicitly account for interactions between SNPs. For each model, we extract the features that contribute most to it, thereby identifying candidate genetic modifiers. Some of these key SNPs lie in well-known AO modifier candidates such as FAN1 and MYT1L, while others lie in genes such as CDYL2, which are proposed as new candidates.
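The comparison described in the abstract (a baseline fitted on sex and CAG only, against a model that also sees SNP genotypes) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the effect sizes and sample sizes are arbitrary, the function names (`simulate`, `fit_linear`, `benchmark`) are hypothetical, and a closed-form ridge regression stands in for the Lasso, Elastic Net, Random Forest and XGBoost models actually used in the thesis. It is not the author's pipeline and does not use Enroll-HD data.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination on held-out data."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_linear(X, y, ridge=0.0):
    """Least squares with an intercept; ridge=0 gives plain OLS.

    The penalty is applied to every coefficient except the intercept.
    """
    A = np.column_stack([np.ones(len(X)), X])
    penalty = ridge * np.eye(A.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(A.T @ A + penalty, A.T @ y)


def predict(coef, X):
    return coef[0] + X @ coef[1:]


def simulate(n=2000, n_snps=20, seed=0):
    """Synthetic cohort: all numbers below are arbitrary assumptions."""
    rng = np.random.default_rng(seed)
    sex = rng.integers(0, 2, size=n).astype(float)
    cag = rng.integers(40, 56, size=n).astype(float)
    # Genotypes coded as minor-allele counts (0, 1 or 2).
    snps = rng.binomial(2, 0.3, size=(n, n_snps)).astype(float)
    effects = np.zeros(n_snps)
    effects[:2] = [-3.0, 2.0]  # two hypothetical modifier SNPs
    onset = (150.0 - 2.2 * cag + 0.5 * sex
             + snps @ effects + rng.normal(0.0, 5.0, size=n))
    return sex, cag, snps, onset


def benchmark(seed=0, ridge=1.0):
    """Return held-out R^2 for the baseline and for the SNP-aware model."""
    sex, cag, snps, onset = simulate(seed=seed)
    n_train = int(0.8 * len(onset))

    X_base = np.column_stack([sex, cag])        # baseline covariates only
    X_full = np.column_stack([sex, cag, snps])  # covariates plus genotypes

    base = fit_linear(X_base[:n_train], onset[:n_train])
    full = fit_linear(X_full[:n_train], onset[:n_train], ridge=ridge)

    r2_base = r2_score(onset[n_train:], predict(base, X_base[n_train:]))
    r2_full = r2_score(onset[n_train:], predict(full, X_full[n_train:]))
    return r2_base, r2_full


if __name__ == "__main__":
    r2_base, r2_full = benchmark()
    print(f"baseline R^2: {r2_base:.3f}")
    print(f"with SNPs R^2: {r2_full:.3f}")
```

Evaluating both models on the same held-out individuals is what makes the difference in R² interpretable as a gain over the baseline. On real genotype data with millions of SNPs, a sparse or tree-based learner such as those named in the abstract would replace the ridge stand-in.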
Note: Bachelor's Theses in Biomedical Engineering. Faculty of Medicine and Health Sciences. Universitat de Barcelona. Academic year: 2023-2024. Tutor/Director: Josep Maria Canals Coll; Director: Jordi Abante Llenas
URI: http://hdl.handle.net/2445/213150
Appears in Collections: Treballs Finals de Grau (TFG) - Enginyeria Biomèdica

Files in This Item:
File: TFG_Fuses_Kuzmina_Caterina.pdf
Size: 5.02 MB
Format: Adobe PDF


This item is licensed under a Creative Commons License.