Díaz, Oliver
Hernández Antón, Sergio
2025-05-21
2025-05-21
2025-01-17
https://hdl.handle.net/2445/221150
Master's final theses, Master in Fundamentals of Data Science, Faculty of Mathematics, Universitat de Barcelona. Year: 2025. Advisor: Oliver Díaz
Breast cancer remains one of the leading causes of cancer-related morbidity and mortality worldwide, which calls for robust methods for early risk prediction, recurrence forecasting, and survival analysis. This thesis defines a comprehensive pipeline for breast cancer risk prediction that emphasizes both technical precision and clinical relevance. The proposed framework integrates data acquisition, preprocessing, feature extraction, model selection, interpretability, and explainability to ensure accurate, transparent, and actionable outcomes. Overall, the thesis aims to advance breast cancer prediction by delivering a robust, interpretable, and clinically relevant pipeline, in line with the goal of improving patient outcomes through early and precise detection. To make the thesis more accessible, Appendix A provides a feature dictionary for both datasets used. The project is also shared as a GitHub repository so that others can build on this research, and Appendix B includes a guide to its structure.
50 p.
application/pdf
eng
cc-by-nc-nd (c) Sergio Hernández Antón, 2025
code: GPL (c) Sergio Hernández Antón, 2025
http://creativecommons.org/licenses/by-nc-nd/3.0/es/
http://www.gnu.org/licenses/gpl-3.0.ca.html
Breast cancer
Preventive medicine
Machine learning
Master's thesis
Using clinical data for breast cancer risk prediction and follow-up
info:eu-repo/semantics/masterThesis
info:eu-repo/semantics/openAccess