Please use this identifier to cite or link to this item: http://hdl.handle.net/2445/194055
Full metadata record
DC Field | Value | Language
dc.contributor.author | Asensio Puig, Laura | -
dc.contributor.author | Alemany, Laia | -
dc.contributor.author | Pavón, Miquel Angel | -
dc.date.accessioned | 2023-02-23T16:34:50Z | -
dc.date.available | 2023-02-23T16:34:50Z | -
dc.date.issued | 2022-06-23 | -
dc.identifier.issn | 2673-2688 | -
dc.identifier.uri | http://hdl.handle.net/2445/194055 | -
dc.description.abstract | Human Papillomavirus (HPV) is the causal agent of 5% of cancers worldwide and the main cause of cervical cancer; it is also associated with a significant percentage of oropharyngeal and anogenital cancers. More than 60% of cervical cancers are caused by the HPV16 genotype, which has been classified into lineages (A, B, C, and D). Lineages are related to the progression of cervical cancer, and the current method to assess lineages is to build a Maximum Likelihood Tree (MLT), which is slow, cannot assess poorly sequenced samples, and requires manual annotation. In this study, we have developed a new model to assess HPV16 lineage using machine learning tools. A total of 645 HPV16 genomes were analyzed using a Genome-Wide Association Study (GWAS), which identified 56 lineage-specific Single Nucleotide Polymorphisms (SNPs). From the SNPs found, training-test models were constructed using different algorithms such as Random Forest (RF), Support Vector Machine (SVM), and K-nearest neighbor (KNN). A distinct set of HPV16 sequences (n = 1,028), whose lineage was previously determined by MLT, was used for validation. The RF-based model allowed a precise assignment of HPV16 lineage, showing an accuracy of 99.5% in the known lineage samples. Moreover, the RF model could assign a lineage to 273 samples that MLT could not determine. In terms of computing time, the RF-based model was almost 40 times faster than MLT. A fast and efficient method for assigning HPV16 lineages could facilitate the implementation of lineage classification as a triage or prognostic marker in the clinical setting. | -
dc.format.extent | 8 p. | -
dc.format.mimetype | application/pdf | -
dc.language.iso | eng | -
dc.publisher | Frontiers Media SA | -
dc.relation.isformatof | Reproduction of the document published at: https://doi.org/10.3389/frai.2022.851841 | -
dc.relation.ispartof | Frontiers in Artificial Intelligence, 2022, vol. 5, p. 851841 | -
dc.relation.uri | https://doi.org/10.3389/frai.2022.851841 | -
dc.rights | cc by (c) Asensio Puig, Laura et al., 2022 | -
dc.rights.uri | http://creativecommons.org/licenses/by/3.0/es/ | *
dc.source | Articles publicats en revistes (Institut d'Investigació Biomèdica de Bellvitge (IDIBELL)) | -
dc.subject.classification | Papillomaviruses | -
dc.subject.classification | Cervical cancer | -
dc.subject.classification | Medical prognosis | -
dc.subject.classification | Machine learning | -
dc.subject.other | Papillomaviruses | -
dc.subject.other | Cervix cancer | -
dc.subject.other | Prognosis | -
dc.subject.other | Machine learning | -
dc.title | A Straightforward HPV16 Lineage Classification Based on Machine Learning | -
dc.type | info:eu-repo/semantics/article | -
dc.type | info:eu-repo/semantics/publishedVersion | -
dc.date.updated | 2023-02-23T13:59:39Z | -
dc.rights.accessRights | info:eu-repo/semantics/openAccess | -
Appears in Collections: Articles publicats en revistes (Institut d'Investigació Biomèdica de Bellvitge (IDIBELL))

Files in This Item:
File | Description | Size | Format
frai-05-851841.pdf | | 801.44 kB | Adobe PDF


This item is licensed under a Creative Commons License.
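
The abstract names K-nearest neighbor (KNN) as one of the algorithms trained on lineage-specific SNPs and reports accuracy against lineages previously determined by MLT. The sketch below illustrates that general idea only; it is not the authors' pipeline or their RF model. The five-position SNP profiles, the `TRAIN` set, the Hamming distance, and `k=3` are all hypothetical choices made for illustration, not the 56 SNPs or the parameters used in the study.

```python
from collections import Counter

# Hypothetical toy data: each profile is the base observed at five
# made-up lineage-informative positions (NOT the study's 56 SNPs).
TRAIN = [
    ("ACGTA", "A"),
    ("ACGTC", "A"),
    ("TCGGA", "B"),
    ("TCGGC", "B"),
    ("ATCTA", "C"),
    ("ATCTG", "C"),
    ("GTAGA", "D"),
    ("GTAGT", "D"),
]


def hamming(a, b):
    """Count positions at which two equal-length SNP profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return sum(x != y for x, y in zip(a, b))


def knn_predict(profile, train=TRAIN, k=3):
    """Assign the majority lineage among the k closest training profiles."""
    neighbours = sorted(train, key=lambda item: hamming(profile, item[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def accuracy(samples, train=TRAIN, k=3):
    """Fraction of (profile, reference_lineage) pairs predicted correctly."""
    hits = sum(knn_predict(p, train, k) == ref for p, ref in samples)
    return hits / len(samples)
```

Calling `knn_predict("ACGTG")` on this toy set returns `"A"`, because the two closest training profiles both belong to lineage A. `accuracy` plays the role of the validation step, with the reference labels standing in for lineages determined by MLT.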