Please use this identifier to cite or link to this item: http://hdl.handle.net/2445/214303
Full metadata record
DC Field: Value
dc.contributor.advisor: Vitrià i Marca, Jordi
dc.contributor.author: López Caro, Álvaro
dc.date.accessioned: 2024-07-04T07:57:56Z
dc.date.available: 2024-07-04T07:57:56Z
dc.date.issued: 2023-06-30
dc.identifier.uri: http://hdl.handle.net/2445/214303
dc.description: Final projects of the Màster de Fonaments de Ciència de Dades (Master in Fundamentals of Data Science), Facultat de Matemàtiques, Universitat de Barcelona. Academic year: 2022-2023. Advisor: Jordi Vitrià i Marca
dc.description.abstract: This thesis endeavors to cast a spotlight on the evolution and applicability of machine translation (MT) evaluation metrics and models, chiefly contrasting statistical methods with more contemporary neural-based ones, with special attention to modern Large Language Models (LLMs). MT, a significant area in Natural Language Processing (NLP), has undergone a vast metamorphosis over the years, bringing into focus the critical need for a thorough exploration of these evolving systems. Our research is anchored on the Digital Corpus of the European Parliament (DCEP), a complex and multilingual corpus whose comprehensive and diversified linguistic data makes it an ideal testbed for benchmarking MT models. Through the use of this extensive corpus, we aim to present a comprehensive benchmarking of selected MT models, encapsulating not just their evolution but also their performance dynamics across different tasks and contexts. A vital facet of our study is evaluating the relevance and reliability of various MT metrics, from traditional ones such as BLEU, METEOR and chrF to newer neural-based metrics that promise to capture semantics more effectively. We aim to uncover the inherent strengths and limitations of these metrics, thereby guiding future practitioners and researchers in choosing appropriate metrics for specific MT contexts. In this holistic examination, we also analyze the interplay between model selection, evaluation metric, and translation quality. This thesis provides a novel lens through which to understand the idiosyncrasies of popular MT models and evaluation metrics, ultimately contributing to more effective and nuanced applications of MT. In sum, this exploration furnishes a new perspective on MT evaluation, honing our understanding of the evolutionary paths of both models and metrics, and providing insights into their contextual performance on the DCEP corpus, creating a benchmark whose insights can serve and contribute to the broader MT community. All the code used for the text pre-/post-processing and the evaluation of the models and metrics at play, along with other intermediate matters, is published publicly in our GitHub repository (an illustrative metric-scoring sketch, separate from that repository, follows the metadata record below).
dc.format.extent: 39 p.
dc.format.mimetype: application/pdf
dc.language.iso: eng
dc.rights: cc-by-nc-nd (c) Álvaro López Caro, 2023
dc.rights: code: Apache (c) Álvaro López Caro, 2023
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/es/
dc.rights.uri: https://www.apache.org/licenses/LICENSE-2.0.txt
dc.source: Màster Oficial - Fonaments de la Ciència de Dades
dc.subject.classification: Traducció automàtica
dc.subject.classification: Lingüística computacional
dc.subject.classification: Tractament del llenguatge natural (Informàtica)
dc.subject.classification: Treballs de fi de màster
dc.subject.other: Machine translating
dc.subject.other: Computational linguistics
dc.subject.other: Natural language processing (Computer science)
dc.subject.other: Master's thesis
dc.title: Machine translation evaluation metrics benchmarking: from traditional MT to LLMs
dc.type: info:eu-repo/semantics/masterThesis
dc.rights.accessRights: info:eu-repo/semantics/openAccess
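
As a purely illustrative aside, the following minimal Python sketch (not taken from the thesis or its GitHub repository) shows how surface-level metrics such as BLEU and chrF, discussed in the abstract above, can be computed with the sacrebleu library; the example hypothesis and reference sentences are invented for demonstration.

# Minimal, hypothetical sketch of corpus-level MT metric scoring with sacrebleu.
# The sentences below are invented examples, not drawn from the DCEP corpus.
import sacrebleu

hypotheses = ["The committee approved the report on Tuesday."]
references = [["The committee adopted the report on Tuesday."]]  # one reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)  # n-gram precision with brevity penalty
chrf = sacrebleu.corpus_chrf(hypotheses, references)  # character n-gram F-score

print(f"BLEU: {bleu.score:.2f}")
print(f"chrF: {chrf.score:.2f}")

Neural-based metrics of the kind mentioned in the abstract (for example, COMET or BERTScore) expose similar corpus-scoring interfaces but additionally require pretrained models.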
Appears in Collections:
Programari - Treballs de l'alumnat
Màster Oficial - Fonaments de la Ciència de Dades

Files in This Item:
File | Description | Size | Format
tfm_lopez_caro_alvaro.pdf | Thesis report (Memòria) | 1.32 MB | Adobe PDF
Machine-Translation-evaluation-metrics-benchmarking-main.zip | Source code (Codi font) | 2.67 MB | ZIP


This item is licensed under a Creative Commons License.