Please use this identifier to cite or link to this item: http://hdl.handle.net/2445/213366
Title: Selection of predictors for peripheral arterial disease using tree-based algorithms
Author: Gonçalves, Margarida
Director/Tutor: Casacuberta, Carles
Keywords: Machine learning
Arterial diseases
Master's theses
Issue Date: 30-Jun-2023
Abstract: The purpose of this thesis is to collaborate with clinicians to improve knowledge of peripheral arterial disease (PAD), using machine learning techniques to select the variables most strongly associated with PAD among a set of predictors from a recent cross-sectional medical study carried out in Barcelona (Gonçalves-Martins et al., 2021). We built several machine learning models using Random Forest, Gradient Boosting Tree, and Extreme Gradient Boosting classifiers to retrieve risk factors; of these, Random Forest was the most efficient. Risk factors were obtained using the SHapley Additive exPlanations (SHAP) library. Results were compared with the known outcome of the logistic regression model used in Gonçalves-Martins et al. (2021). We were able to replicate the main results of that study and to discover new nuances in the factors that play a role in the development of PAD. Consistent with that study, smoking was found to be a strong predictor of PAD in both women and men, whereas hypertension was a strong predictor in women and diabetes in men. Surprisingly, dyslipidemia appeared to be negatively correlated with PAD. Furthermore, cholesterol and blood pressure levels could be unreliable for an analysis of PAD risk factors because of the effect of medication. We also found that REGICOR scores are most consistent when their continuous value is used, and that a history of cardiovascular events is especially influential on PAD in men. In addition, abdominal perimeter proved more efficient than body mass index and obesity for predicting PAD and discerning its risk factors, in general but especially for women.
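As a rough illustration of the attribution idea behind SHAP, the sketch below computes exact Shapley values by enumerating feature coalitions. It is not the thesis's code and does not use the SHAP library: the function `shapley_values`, the toy model `toy_risk`, its coefficients, and the feature names are all hypothetical and chosen only to show how a prediction is split among predictors.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values of `predict` at `instance`, relative to `baseline`.

    Features outside a coalition are held at their baseline value.
    The cost is exponential in the number of features, so this only
    suits very small examples.
    """
    names = list(instance)
    n = len(names)
    values = {}
    for target in names:
        others = [f for f in names if f != target]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without = {
                    f: (instance[f] if f in coalition else baseline[f])
                    for f in names
                }
                with_target = dict(without)
                with_target[target] = instance[target]
                # Marginal contribution of `target` to this coalition.
                total += weight * (predict(with_target) - predict(without))
        values[target] = total
    return values


def toy_risk(x):
    """Hypothetical risk score with made-up coefficients (assumption)."""
    return (
        0.05
        + 0.20 * x["smoker"]
        + 0.10 * x["diabetes"]
        + 0.15 * x["smoker"] * x["diabetes"]
    )


baseline = {"smoker": 0, "diabetes": 0, "hypertension": 0}
patient = {"smoker": 1, "diabetes": 1, "hypertension": 1}

phi = shapley_values(toy_risk, patient, baseline)
for name, value in phi.items():
    print(f"{name}: {value:.3f}")
```

In this toy setting the attributions sum to the difference between the patient's score and the baseline score, and a feature the model ignores (here `hypertension`) receives zero. Tree-based explainers are designed to obtain the same kind of attribution for tree ensembles without enumerating every coalition.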
Note: Final projects of the Màster de Fonaments de Ciència de Dades (Master's in Fundamentals of Data Science), Faculty of Mathematics, Universitat de Barcelona. Academic year: 2022-2023. Advisor: Carles Casacuberta
URI: http://hdl.handle.net/2445/213366
Appears in Collections: Màster Oficial - Fonaments de la Ciència de Dades

Files in This Item:
File: tfm_margarida_goncalves.pdf
Description: Memòria (thesis report)
Size: 1.89 MB
Format: Adobe PDF


This item is licensed under a Creative Commons License.