Evaluating Tool-Augmented ReAct Language Agents

Eguzkitza Zalakain, Jokin

Please use this identifier to cite or link to this item: https://hdl.handle.net/2445/223176

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Igual Muñoz, Laura	-
dc.contributor.author	Eguzkitza Zalakain, Jokin	-
dc.date.accessioned	2025-09-16T08:01:34Z	-
dc.date.available	2025-09-16T08:01:34Z	-
dc.date.issued	2025-06-30	-
dc.identifier.uri	https://hdl.handle.net/2445/223176	-
dc.description	Master's final projects, Master's in Fundamentals of Data Science, Faculty of Mathematics, Universitat de Barcelona. Year: 2025. Advisors: Laura Igual Muñoz and Pablo Álvarez	ca
dc.description.abstract	This thesis studies how to evaluate ReAct agents that use external tools. ReAct agents are AI agents that combine reasoning with tool use (function calls), allowing large language models to perform tasks that require access to external sources of information. These agents are becoming more common in real applications, but evaluating their behaviour remains a challenge. Using LangGraph and LangChain, three different AI agents are built on locally deployed LLMs served with Ollama. These agents use open-source tools such as Wikipedia, Wikidata, Yahoo Finance and PDF readers. To evaluate them, the project combines rule-based checks with RAGAS metrics to measure tool use, answer quality, factual correctness and context use. The results show that prompt design is very important in guiding the agent’s behaviour, and that typical question-answering metrics are not always enough to measure how well an agent works. This work offers a simple and practical way to test LLM agents. All the corresponding code notebooks can be found in the following repository: https://github.com/Jokinn9/Evaluating-Tool-Augmented-ReAct-Language-Agents	ca
dc.format.extent	37 p.	-
dc.format.mimetype	application/pdf	-
dc.language.iso	eng	ca
dc.rights	cc-by-nc-nd (c) Jokin Eguzkitza Zalakain, 2025	-
dc.rights	codi: GPL (c) Jokin Eguzkitza Zalakain, 2025	-
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/es/	*
dc.rights.uri	http://www.gnu.org/licenses/gpl-3.0.ca.html	*
dc.source	Màster Oficial - Fonaments de la Ciència de Dades	-
dc.subject.classification	Tractament del llenguatge natural (Informàtica)	-
dc.subject.classification	Intel·ligència artificial	-
dc.subject.classification	Agents intel·ligents (Programari)	-
dc.subject.classification	Treballs de fi de màster	-
dc.subject.other	Natural language processing (Computer science)	-
dc.subject.other	Artificial intelligence	-
dc.subject.other	Intelligent agents (Computer software)	-
dc.subject.other	Master's thesis	-
dc.title	Evaluating Tool-Augmented ReAct Language Agents	ca
dc.type	info:eu-repo/semantics/masterThesis	ca
dc.rights.accessRights	info:eu-repo/semantics/openAccess	ca
Appears in Collections:	Màster Oficial - Fonaments de la Ciència de Dades; Programari - Treballs de l'alumnat

Files in This Item:

File	Description	Size	Format
TFM_Eguzkitza_Zalakain_Jokin.zip	Source code	40.29 MB	zip
TFM report.pdf	Report	9.61 MB	Adobe PDF

This item is licensed under a Creative Commons License