Please use this identifier to cite or link to this item:
http://hdl.handle.net/2445/212903
Title: | Assessing VTE risk in cancer patients using deep learning synthetic data generation and domain adaptation techniques |
Author: | Bech Sala, Sergi |
Director/Tutor: | Pujol Vila, Oriol; Lobato, Barbara |
Keywords: | Thromboembolism; Machine learning; Cancer patients; Master's thesis |
Issue Date: | 30-Jun-2023 |
Abstract: | [en] Venous thromboembolism (VTE) poses a significant health risk for cancer patients. In this thesis, we address the challenge of predicting VTE occurrence in cancer patients by employing state-of-the-art methods and exploring the potential of deep learning and synthetic data generation techniques. We present a Python implementation of the current state-of-the-art method for VTE prediction in cancer patients, which serves as a benchmark for our subsequent investigations. Building upon this foundation, the investigation turns to the application of deep learning synthetic data generation methods to assess the risk of future treatments and medication for preventing VTE in cancer patients. Using a small dataset comprising genetic and clinical variables, we extensively explore and compare the performance of state-of-the-art generative deep learning models specifically designed for tabular data. Notably, we adopt the CopulaGAN architecture to generate synthetic tabular data, which is subsequently used to train a deep learning-based classifier, using domain adaptation techniques to fine-tune the model with real data. The resulting model outperforms current state-of-the-art medical scores in accurately assessing VTE risk. Furthermore, the Precision-Recall curve derived from our model offers enhanced flexibility in selecting optimal operational points for VTE risk assessment. By combining the power of deep learning and synthetic data generation, our research contributes to the advancement of VTE risk prediction in cancer patients. The proposed methodology demonstrates promising results and paves the way for improved patient care and personalized treatment strategies. In addition, we introduce target encoding in the architecture of the conditional tabular generative adversarial network (CTGAN) to better handle large categorical variables.
We believe that our findings have significant implications for the field of oncology and hold great potential for enhancing patient outcomes and reducing the burden of VTE in cancer care. |
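As an illustration of the target encoding idea mentioned in the abstract, the following is a minimal sketch and not the thesis's implementation. It assumes a binary target (VTE yes/no) and a smoothed-mean encoding; the function names, the `smoothing` parameter, and the toy values are hypothetical and chosen only for this example. How the encoded values are wired into CTGAN is not shown here.

```python
from collections import defaultdict


def fit_target_encoding(categories, targets, smoothing=10.0):
    """Map each category to a smoothed mean of a numeric (e.g. 0/1) target.

    Assumption: smoothed-mean encoding; the thesis may use a different variant.
    """
    if len(categories) != len(targets):
        raise ValueError("categories and targets must have the same length")
    if not targets:
        raise ValueError("at least one observation is required")

    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1

    # Shrink each category mean toward the global mean.
    # Categories with few observations are shrunk the most.
    encoding = {
        cat: (sums[cat] + smoothing * global_mean) / (counts[cat] + smoothing)
        for cat in counts
    }
    return encoding, global_mean


def apply_target_encoding(categories, encoding, default):
    """Replace each category with its encoded value; unseen ones get `default`."""
    return [encoding.get(cat, default) for cat in categories]


# Toy, made-up data: a categorical column and a binary outcome.
tumour_site = ["lung", "lung", "colon", "colon", "colon", "breast"]
vte_event = [1, 0, 0, 0, 1, 1]

encoding, prior = fit_target_encoding(tumour_site, vte_event, smoothing=2.0)
encoded_column = apply_target_encoding(tumour_site + ["pancreas"], encoding, prior)
```

The appeal of this kind of encoding for a generative model is that a categorical column with many levels becomes a single numeric column instead of a wide one-hot block. In practice the encoding should be fitted on training data only, since it uses the target.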
Note: | Final projects of the Master's in Fundamentals of Data Science (Fonaments de Ciència de Dades), Faculty of Mathematics, Universitat de Barcelona. Academic year: 2022-2023. Tutors: Oriol Pujol Vila and Barbara Lobato |
URI: | http://hdl.handle.net/2445/212903 |
Appears in Collections: | Programari - Treballs de l'alumnat; Màster Oficial - Fonaments de la Ciència de Dades |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
tfm_bech_sala_sergi.pdf | Thesis report | 4.37 MB | Adobe PDF |
Master_thesis-main.zip | Source code | 9.51 MB | zip |
This item is licensed under a Creative Commons License