Document type

Bachelor's thesis

Publication date

Publication license

cc-by-nc-nd (c) Caterina Fuses i Kuzmina, 2024
Please always use this identifier to cite or link to this document: https://hdl.handle.net/2445/213150

Exploring machine learning approaches for phenotype prediction of Huntington's disease

Abstract

The onset of symptoms in Huntington’s disease (HD) is clinically predicted primarily from the length of the CAG trinucleotide expansion in the HTT gene. However, this prediction explains only around 50% of the variability of the phenotype. It is estimated that 40% of the remaining variability is heritable, suggesting the presence of other genetic factors. Genome-wide association studies (GWAS) have identified potential genetic modifiers, but they capture only linear effects and are computationally demanding. This project benchmarks various machine learning algorithms trained on an Enroll-HD GWAS dataset to predict the age of onset (AO) of HD. The dataset comprises the genotypes of millions of SNPs from approximately 9,000 individuals. The models considered include regularized linear models (Lasso and Elastic Net) and tree-based models (Random Forest and XGBoost), and their predictive power is compared to an Ordinary Least Squares baseline model trained solely with sex and CAG as covariates. The results indicate that tree-based models achieve the best estimation of AO, improving the prediction by 3% with respect to the baseline, possibly due to their implicit consideration of interactions between SNPs. For each model, we extract the most significant features contributing to the model, thereby identifying genetic modifiers. Some of these key SNPs lie in well-known AO modifier candidates such as FAN1 and MYT1L, while others lie in genes such as CDYL2, which are proposed as new candidates.
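The comparison described in the abstract (a sex-plus-CAG baseline against a sparse model that also sees SNP genotypes) can be illustrated with a minimal, standard-library Python sketch. It is not the thesis pipeline: the data are simulated, the effect sizes, sample size, penalty `alpha`, and all function names are assumptions made for illustration, and the Lasso is fitted to the baseline residuals with a simple coordinate-descent loop rather than with any particular library.

```python
import random

random.seed(0)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_ols(X, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    Z = [[1.0] + row for row in X]
    p = len(Z[0])
    A = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    b = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(p)]
    return solve(A, b)

def predict(beta, X):
    return [beta[0] + sum(b * x for b, x in zip(beta[1:], row)) for row in X]

def r2(y, pred):
    mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, pred))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def soft(v, t):
    """Soft-thresholding operator used by the Lasso update."""
    return v - t if v > t else v + t if v < -t else 0.0

def fit_lasso(X, y, alpha, n_iter=100):
    """Cyclic coordinate descent for (1/2n)||y - Xw||^2 + alpha*||w||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    w, resid = [0.0] * p, y[:]
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * w[j]
            new = soft(rho, alpha) / col_sq[j]
            delta = new - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                w[j] = new
    return w

# Simulated cohort (all numbers are made up): sex, CAG length, P SNP genotypes in {0,1,2}.
N, P, MAF = 400, 30, 0.3
rows = []
for _ in range(N):
    sex = float(random.randint(0, 1))
    cag = float(random.randint(40, 55))
    snps = [float((random.random() < MAF) + (random.random() < MAF)) for _ in range(P)]
    age = 120.0 - 1.5 * cag + 3.0 * snps[0] - 2.0 * snps[5] + random.gauss(0.0, 5.0)
    rows.append((sex, cag, snps, age))
train, test = rows[:300], rows[300:]

def covariates(rs):
    return [[r[0], r[1]] for r in rs]

def onset(rs):
    return [r[3] for r in rs]

# Baseline: OLS on sex and CAG only, scored on held-out individuals.
beta = fit_ols(covariates(train), onset(train))
base_train, base_test = predict(beta, covariates(train)), predict(beta, covariates(test))
r2_base = r2(onset(test), base_test)

# Lasso on centred genotypes, fitted to what the baseline leaves unexplained.
means = [sum(r[2][j] for r in train) / len(train) for j in range(P)]

def centred(rs):
    return [[r[2][j] - means[j] for j in range(P)] for r in rs]

resid_train = [y - f for y, f in zip(onset(train), base_train)]
w = fit_lasso(centred(train), resid_train, alpha=0.3)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
lasso_test = [f + sum(wj * x for wj, x in zip(w, row))
              for f, row in zip(base_test, centred(test))]
r2_lasso = r2(onset(test), lasso_test)

print(f"held-out R2, baseline: {r2_base:.3f}; baseline + Lasso: {r2_lasso:.3f}")
print("SNP indices with non-zero Lasso weight:", selected)
```

The SNPs with non-zero weights play the role of the "most significant features" mentioned above. A linear sketch like this cannot capture the SNP-SNP interactions that the abstract suggests tree-based models exploit.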

Description

Bachelor's theses in Biomedical Engineering. Faculty of Medicine and Health Sciences. Universitat de Barcelona. Academic year: 2023-2024. Tutor/Director: Josep Maria Canals Coll; Director: Jordi Abante Llenas

Citation

FUSES I KUZMINA, Caterina. Exploring machine learning approaches for phenotype prediction of Huntington's disease. [accessed: 25 January 2026]. [Available at: https://hdl.handle.net/2445/213150]
