A Domain Adaptation Framework for Harmonized Representation Learning in Medical Datasets
| dc.contributor.advisor | Pujol Vila, Oriol | |
| dc.contributor.advisor | Lobato Delgado, Bárbara | |
| dc.contributor.author | Vara Mira, Alejandro | |
| dc.date.accessioned | 2026-04-01T14:42:05Z | |
| dc.date.available | 2026-04-01T14:42:05Z | |
| dc.date.issued | 2026-01-17 | |
| dc.description | Final projects of the Master's in Fundamentals of Data Science, Faculty of Mathematics, Universitat de Barcelona. Year: 2026. Advisors: Oriol Pujol Vila and Bárbara Lobato Delgado | |
| dc.description.abstract | This Master’s Thesis addresses the critical challenge of clinical data fragmentation and the prohibitive costs of medical data acquisition by proposing a deep learning architecture for cross-dataset knowledge transfer. While the medical community possesses vast amounts of data, it remains largely trapped in isolated silos characterized by structural heterogeneity and measurement bias. To bridge these gaps, this research introduces a multi-branch neural framework that leverages a large-scale auxiliary dataset, MIMIC-III, to enrich the latent representations of smaller, specialized target datasets. The methodology centers on a dual-encoding strategy where a shared encoder extracts robust statistical patterns from common clinical attributes across populations, while independent private encoders preserve domain-specific niche variables. Empirical validation in the context of ICU mortality prediction demonstrates that this harmonized representation learning consistently improves Precision-Recall and AUC-ROC metrics. By applying a rigorous methodology across sequential experiments, the study confirms that these performance gains are statistically significant and directly attributable to the enhanced feature representation, rather than artifacts of stochasticity or overfitting. Ultimately, this work provides a scalable blueprint for clinical data codification, proving that common attributes can serve as a functional bridge to maximize the utility of existing medical records in data-constrained environments. | |
| dc.format.extent | 25 p. | |
| dc.format.mimetype | application/pdf | |
| dc.identifier.uri | https://hdl.handle.net/2445/228662 | |
| dc.language.iso | eng | |
| dc.rights | cc-by-nc-nd (c) Alejandro Vara Mira, 2026 | |
| dc.rights.accessRights | info:eu-repo/semantics/openAccess | |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/es/ | |
| dc.source | Official Master's - Fundamentals of Data Science | |
| dc.subject.classification | Informàtica mèdica | |
| dc.subject.classification | Aprenentatge per transferència | |
| dc.subject.classification | Aprenentatge profund | |
| dc.subject.classification | Medicina basada en l'evidència | |
| dc.subject.classification | Alejandro Vara Mira | |
| dc.subject.classification | Treballs de fi de màster | |
| dc.subject.other | Medical informatics | |
| dc.subject.other | Transfer learning (Machine learning) | |
| dc.subject.other | Deep learning (Machine learning) | |
| dc.subject.other | Evidence-based medicine | |
| dc.subject.other | Master's thesis | |
| dc.title | A Domain Adaptation Framework for Harmonized Representation Learning in Medical Datasets | |
| dc.type | info:eu-repo/semantics/masterThesis |
Files
Original bundle
- Name:
- TFM_Vara_Mira_Alejandro.pdf
- Size:
- 1.73 MB
- Format:
- Adobe Portable Document Format