Please use this identifier to cite or link to this item: http://hdl.handle.net/2445/200102
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Duran Frigola, Miquel | -
dc.contributor.advisor | Vitrià i Marca, Jordi | -
dc.contributor.author | Torre García, Marcos de la | -
dc.date.accessioned | 2023-06-29T09:32:09Z | -
dc.date.available | 2023-06-29T09:32:09Z | -
dc.date.issued | 2023-01-15 | -
dc.identifier.uri | http://hdl.handle.net/2445/200102 | -
dc.description | Final projects of the Master's in Foundations of Data Science, Faculty of Mathematics, Universitat de Barcelona. Academic year: 2022-2023. Advisors: Miquel Duran Frigola and Jordi Vitrià i Marca | ca
dc.description.abstract | [en] Predictive modelling of the antimicrobial activity of molecules is a crucial step towards the discovery of anti-infective medicines. Unfortunately, there is a shortage of models covering endemic pathogens of the Global South, reflecting the existing bias in research towards diseases prevalent in wealthy countries. This project has developed a pipeline to systematically build drug discovery models, in particular antimicrobial activity prediction models for small-molecule compounds. Assay results for a selected pathogen are extracted from a publicly available database, ChEMBL. The data is then cleaned and processed in order to build predictive models with various Automated Machine Learning (AutoML) techniques using the ZairaChem tool from the Ersilia Open Data Initiative. The pipeline has been applied to six pathogens of great relevance to global health, known as ESKAPE, for which the data has been obtained and processed and baseline models created. We have built the full set of final models for one of these pathogens, Staphylococcus aureus. The pipeline can be used on any other pathogen for which ChEMBL has sufficient data. It will be used to deploy models in the Ersilia Model Hub, a repository of pre-trained ML models for drug discovery in global health. This will be an opportunity to compensate for the shortage of ML models adapted to the needs of the Global South. | ca
dc.format.extent | 41 p. | -
dc.format.mimetype | application/pdf | -
dc.language.iso | eng | ca
dc.rights | cc-by-nc-nd (c) Marcos de la Torre García, 2023 | -
dc.rights | code: GPL (c) Marcos de la Torre García, 2023 | -
dc.rights.uri | http://www.gnu.org/licenses/gpl-3.0.ca.html | *
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/es/ | *
dc.source | Màster Oficial - Fonaments de la Ciència de Dades | -
dc.subject.classification | Teoria de la predicció | -
dc.subject.classification | Disseny de medicaments | -
dc.subject.classification | Bacteris patògens | -
dc.subject.classification | Treballs de fi de màster | -
dc.subject.classification | Aprenentatge automàtic | ca
dc.subject.other | Prediction theory | -
dc.subject.other | Drug design | -
dc.subject.other | Pathogenic bacteria | -
dc.subject.other | Master's theses | -
dc.subject.other | Machine learning | en
dc.title | Applying AutoML techniques in drug discovery: systematic modelling of antimicrobial drug activity on a wide spectrum of pathogens | ca
dc.type | info:eu-repo/semantics/masterThesis | ca
dc.rights.accessRights | info:eu-repo/semantics/openAccess | ca
Appears in Collections: Programari - Treballs de l'alumnat
Màster Oficial - Fonaments de la Ciència de Dades

Files in This Item:
File | Description | Size | Format
tfm_torre_garcia_marcos_de_la.pdf | Thesis report | 3.56 MB | Adobe PDF
antimicrobial-ml-tasks-main.zip | Source code | 15.44 MB | zip


This item is licensed under a Creative Commons License.