Please use this identifier to cite or link to this item:
https://hdl.handle.net/2445/223909

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | Statuto, Nahuel | - |
| dc.contributor.author | Mantilla Carreño, Juan Pablo | - |
| dc.date.accessioned | 2025-10-28T10:38:00Z | - |
| dc.date.available | 2025-10-28T10:38:00Z | - |
| dc.date.issued | 2025-06-10 | - |
| dc.identifier.uri | https://hdl.handle.net/2445/223909 | - |
| dc.description | Treballs Finals de Grau d'Enginyeria Informàtica, Facultat de Matemàtiques, Universitat de Barcelona, Any: 2025, Director: Nahuel Statuto | ca |
| dc.description.abstract | The increasing use of machine learning poses significant privacy risks, especially when sensitive data is used, and conventional anonymization methods have proven insufficient. Differential privacy is a rigorous framework for data privacy that provides strong mathematical guarantees, and applying it to machine learning addresses these risks. We present the fundamental basis of these concepts and then empirically investigate, implement, and analyse two techniques for integrating differential privacy into machine learning pipelines. The first technique, dataset perturbation, adds calibrated Gaussian noise directly to the training data, after which any standard machine learning pipeline can be used. The second, gradient perturbation, centers on differentially private stochastic gradient descent (DP-SGD), an approach that injects noise into the gradients during the training phase. For the comparative study, we developed a multi-class classification architecture using a real-world, sensitive medical dataset derived from the MIMIC-IV database. Model performance was evaluated against a non-private baseline using metrics appropriate to our class imbalance, such as Macro F1-score and Macro OVO AUC. The results confirm the trade-off between privacy and utility in the models developed: higher privacy guarantees consistently result in reduced model utility. In the specific context of this study, gradient perturbation provided a slightly more advantageous model in the overall balance of utility and privacy. Ultimately, the thesis provides strong evidence for the feasibility of training useful and formally private machine learning models on real-world medical data, demonstrating that a practical "sweet spot" between privacy and performance can be found. | ca |
| dc.format.extent | 49 p. | - |
| dc.format.mimetype | application/pdf | - |
| dc.language.iso | eng | ca |
| dc.rights | memòria: cc-nc-nd (c) Juan Pablo Mantilla Carreño, 2025 | - |
| dc.rights | codi: GPL (c) Juan Pablo Mantilla Carreño, 2025 | - |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/es/ | - |
| dc.rights.uri | http://www.gnu.org/licenses/gpl-3.0.ca.html | * |
| dc.source | Treballs Finals de Grau (TFG) - Enginyeria Informàtica | - |
| dc.subject.classification | Aprenentatge automàtic | ca |
| dc.subject.classification | Protecció de dades | ca |
| dc.subject.classification | Dades massives | ca |
| dc.subject.classification | Programari | ca |
| dc.subject.classification | Treballs de fi de grau | ca |
| dc.subject.classification | Processos gaussians | ca |
| dc.subject.other | Machine learning | en |
| dc.subject.other | Data protection | en |
| dc.subject.other | Big data | en |
| dc.subject.other | Computer software | en |
| dc.subject.other | Bachelor's theses | en |
| dc.subject.other | Gaussian processes | en |
| dc.title | Differentially Private Machine Learning: Implementation and Analysis of Gradient and Dataset Perturbation Techniques | ca |
| dc.type | info:eu-repo/semantics/bachelorThesis | ca |
| dc.rights.accessRights | info:eu-repo/semantics/openAccess | ca |
| Appears in Collections: | Treballs Finals de Grau (TFG) - Enginyeria Informàtica; Treballs Finals de Grau (TFG) - Matemàtiques; Programari - Treballs de l'alumnat | |
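The two techniques named in the abstract can be illustrated in code. The sketch below is not the thesis's own implementation (that is distributed separately as `codi.zip`); it is a minimal NumPy illustration under common assumptions: dataset perturbation uses the standard analytic Gaussian-mechanism calibration σ = Δ·√(2 ln(1.25/δ))/ε, and the DP-SGD step follows the usual recipe of per-example gradient clipping, summation, Gaussian noise addition, and averaging. All function and parameter names are hypothetical.

```python
import numpy as np

def perturb_dataset(X, sensitivity, epsilon, delta, rng=None):
    """Dataset perturbation (illustrative): add calibrated Gaussian noise to
    the training features, after which any standard ML pipeline can be used.

    Noise scale follows the classic Gaussian-mechanism calibration
    sigma = sensitivity * sqrt(2 * ln(1.25 / delta)) / epsilon.
    """
    rng = np.random.default_rng(rng)
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return X + rng.normal(0.0, sigma, size=X.shape)

def dp_sgd_step(weights, per_example_grads, lr, clip_norm, noise_multiplier,
                rng=None):
    """One DP-SGD update (illustrative): clip each per-example gradient to
    clip_norm, sum, add Gaussian noise scaled by noise_multiplier * clip_norm,
    average over the batch, and take a gradient step."""
    rng = np.random.default_rng(rng)
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale down (never up) so each example's gradient norm is <= clip_norm.
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return weights - lr * (total + noise) / len(per_example_grads)
```

In both cases the privacy/utility trade-off discussed in the abstract shows up directly as a noise scale: smaller ε (or a larger noise multiplier) means stronger guarantees but noisier data or gradients, and hence lower model utility.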
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| tfg_Mantilla_Carreño_Juan_Pablo.pdf | Memòria | 1.37 MB | Adobe PDF | View/Open |
| codi.zip | Codi font | 38.56 kB | zip | View/Open |
This item is licensed under a Creative Commons License.