Please use this identifier to cite or link to this item: http://hdl.handle.net/2445/200102
Title: Applying AutoML techniques in drug discovery: systematic modelling of antimicrobial drug activity on a wide spectrum of pathogens
Author: Torre García, Marcos de la
Director/Tutor: Duran Frigola, Miquel
Vitrià i Marca, Jordi
Keywords: Prediction theory
Drug design
Pathogenic bacteria
Master's theses
Machine learning
Issue Date: 15-Jan-2023
Abstract: Predictive modelling of the antimicrobial activity of molecules is a crucial step towards the discovery of anti-infective medicines. Unfortunately, there is a shortage of models covering pathogens endemic to the Global South, reflecting the existing research bias towards diseases prevalent in wealthy countries. This project has developed a pipeline to systematically build drug discovery models, in particular antimicrobial activity prediction models for small-molecule compounds. Assay results for a selected pathogen are extracted from ChEMBL, a publicly available database. These data are then cleaned and processed to build predictive models with various Automated Machine Learning (AutoML) techniques using the ZairaChem tool from the Ersilia Open Source Initiative. The pipeline has been applied to the six pathogens of great relevance to global health known as ESKAPE: for each, the data have been obtained and processed and baseline models created. We have built the full set of final models for one of these pathogens, Staphylococcus aureus. The pipeline can be used on any other pathogen for which ChEMBL has sufficient data. It will be used to deploy models in the Ersilia Model Hub, a repository of pre-trained ML models for drug discovery in global health, offering an opportunity to compensate for the shortage of ML models adapted to the needs of the Global South.
Note: Final projects of the Master's in Fundamentals of Data Science (Màster de Fonaments de Ciència de Dades), Faculty of Mathematics, Universitat de Barcelona. Academic year: 2022-2023. Tutors: Miquel Duran Frigola and Jordi Vitrià i Marca
URI: http://hdl.handle.net/2445/200102
Appears in Collections: Software - Student works
Official Master's - Fundamentals of Data Science

Files in This Item:
File                                Description     Size        Format
tfm_torre_garcia_marcos_de_la.pdf   Thesis report   3.56 MB     Adobe PDF
antimicrobial-ml-tasks-main.zip     Source code     15.44 MB    zip


This item is licensed under a Creative Commons License.