Please use this identifier to cite or link to this item: https://hdl.handle.net/2445/223176
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Igual Muñoz, Laura | -
dc.contributor.author | Eguzkitza Zalakain, Jokin | -
dc.date.accessioned | 2025-09-16T08:01:34Z | -
dc.date.available | 2025-09-16T08:01:34Z | -
dc.date.issued | 2025-06-30 | -
dc.identifier.uri | https://hdl.handle.net/2445/223176 | -
dc.description | Master's final projects, Master in Fundamentals of Data Science, Faculty of Mathematics, Universitat de Barcelona. Year: 2025. Advisors: Laura Igual Muñoz and Pablo Álvarez | ca
dc.description.abstract | This thesis studies how to evaluate ReAct agents that use external tools. ReAct agents are AI agents that combine reasoning and tool use (function calls), allowing large language models to perform tasks that require access to external sources of information. These agents are becoming more common in real applications, but evaluating their behaviour remains a challenge. Three different AI agents are built with LangGraph and LangChain, using locally deployed LLMs served with Ollama. These agents use open-source tools such as Wikipedia, Wikidata, Yahoo Finance and PDF readers. To evaluate them, the project combines rule-based checks with RAGAS metrics to measure tool use, answer quality, factual correctness and context use. The results show that prompt design is very important for guiding the agent’s behaviour, and that typical question-answering metrics are not always enough to measure how well an agent works. This work offers a simple and practical way to test LLM agents. The corresponding code notebooks can be found in the following repository: https://github.com/Jokinn9/Evaluating-Tool-Augmented-ReAct-Language-Agents | ca
dc.format.extent | 37 p. | -
dc.format.mimetype | application/pdf | -
dc.language.iso | eng | ca
dc.rights | cc-by-nc-nd (c) Jokin Eguzkitza Zalakain, 2025 | -
dc.rights | code: GPL (c) Jokin Eguzkitza Zalakain, 2025 | -
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/es/ | *
dc.rights.uri | http://www.gnu.org/licenses/gpl-3.0.ca.html | *
dc.source | Màster Oficial - Fonaments de la Ciència de Dades | -
dc.subject.classification | Tractament del llenguatge natural (Informàtica) | -
dc.subject.classification | Intel·ligència artificial | -
dc.subject.classification | Agents intel·ligents (Programari) | -
dc.subject.classification | Treballs de fi de màster | -
dc.subject.other | Natural language processing (Computer science) | -
dc.subject.other | Artificial intelligence | -
dc.subject.other | Intelligent agents (Computer software) | -
dc.subject.other | Master's thesis | -
dc.title | Evaluating Tool-Augmented ReAct Language Agents | ca
dc.type | info:eu-repo/semantics/masterThesis | ca
dc.rights.accessRights | info:eu-repo/semantics/openAccess | ca
Appears in Collections: Màster Oficial - Fonaments de la Ciència de Dades
Programari - Treballs de l'alumnat

Files in This Item:
File | Description | Size | Format
TFM_Eguzkitza_Zalakain_Jokin.zip | Source code | 40.29 MB | zip
TFM report.pdf | Report | 9.61 MB | Adobe PDF


This item is licensed under a Creative Commons License.