Please use this identifier to cite or link to this item: http://hdl.handle.net/2445/212903
Title: Assessing VTE risk in cancer patients using deep learning synthetic data generation and domain adaptation techniques
Author: Bech Sala, Sergi
Director/Tutor: Pujol Vila, Oriol
Lobato, Barbara
Keywords: Thromboembolism
Machine learning
Cancer patients
Master's thesis
Issue Date: 30-Jun-2023
Abstract: [en] Venous thromboembolism (VTE) poses a significant health risk for cancer patients. In this thesis, we address the challenge of predicting VTE occurrence in cancer patients by employing state-of-the-art methods and exploring the potential of deep learning and synthetic data generation techniques. We present a Python implementation of the current state-of-the-art method for VTE prediction in cancer patients, which serves as a benchmark for our subsequent investigations. Building upon this foundation, the investigation focuses on the application of deep learning synthetic data generation methods to assess the risk of future treatments and medication for preventing VTE in cancer patients. Using a small dataset comprising genetic and clinical variables, we extensively explore and compare the performance of state-of-the-art generative deep learning models specifically designed for tabular data. Notably, we adopt the CopulaGAN architecture to generate synthetic tabular data, which is subsequently used to train a deep learning-based classifier, using domain adaptation techniques to fine-tune the model with real data. The resulting model outperforms current state-of-the-art medical scores in accurately assessing VTE risk. Furthermore, the Precision-Recall curve derived from our model offers enhanced flexibility in selecting optimal operating points for VTE risk assessment. By combining the power of deep learning and synthetic data generation, our research contributes to the advancement of VTE risk prediction in cancer patients. The proposed methodology demonstrates promising results and paves the way for improved patient care and personalized treatment strategies. Furthermore, we introduce target encoding in the architecture of the conditional tabular generative adversarial network (CTGAN) to better handle large categorical variables.
We believe that our findings have significant implications for the field of oncology and hold great potential for enhancing patient outcomes and reducing the burden of VTE in cancer care.
Note: Final projects of the Master's in Fundamentals of Data Science, Faculty of Mathematics, Universitat de Barcelona. Academic year: 2022-2023. Tutors: Oriol Pujol Vila and Barbara Lobato
URI: http://hdl.handle.net/2445/212903
Appears in Collections: Programari - Treballs de l'alumnat
Màster Oficial - Fonaments de la Ciència de Dades

Files in This Item:
File                      Description     Size      Format
tfm_bech_sala_sergi.pdf   Thesis report   4.37 MB   Adobe PDF
Master_thesis-main.zip    Source code     9.51 MB   zip


This item is licensed under a Creative Commons License.