
Document type

Master's thesis

Publication date

Publication license

cc-by-nc-nd (c) Jokin Eguzkitza Zalakain, 2025
Please always use this identifier to cite or link to this document: https://hdl.handle.net/2445/223176

Evaluating Tool-Augmented ReAct Language Agents

Abstract

This thesis studies how to evaluate ReAct agents that use external tools. ReAct agents are AI agents that combine reasoning with tool use (function calls), allowing large language models to perform tasks that require access to external sources of information. These agents are becoming more common in real applications, but evaluating their behaviour remains a challenge. Three different AI agents are built with LangGraph and LangChain, using locally deployed LLMs served with Ollama. These agents use open-source tools such as Wikipedia, Wikidata, Yahoo Finance and PDF readers. To evaluate them, the project combines rule-based checks with RAGAS metrics to measure tool use, answer quality, factual correctness and context use. The results show that prompt design is very important for guiding the agent's behaviour, and that typical question-answering metrics are not always enough to measure how well an agent works. This work offers a simple and practical way to test LLM agents. All the corresponding code notebooks can be found in the following repository: https://github.com/Jokinn9/Evaluating-Tool-Augmented-ReAct-Language-Agents
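
As an illustration of what a rule-based check on tool use might look like, the following is a minimal sketch and not the thesis's actual implementation. The `AgentRun` record, the `check_tool_use` function, the rule names and the tool identifiers are all hypothetical; the real checks live in the linked repository and may differ.

```python
from dataclasses import dataclass, field


@dataclass
class AgentRun:
    """Hypothetical record of one agent run (not taken from the thesis code)."""
    question: str
    tool_calls: list[str] = field(default_factory=list)  # tool names, in call order
    answer: str = ""


def check_tool_use(run: AgentRun, expected_tools: set[str], max_calls: int = 5) -> dict[str, bool]:
    """Apply simple pass/fail rules to a single run.

    The rules are illustrative assumptions: the agent should call at least one
    expected tool, call no tool outside the expected set, stay within a call
    budget, and return a non-empty answer.
    """
    called = set(run.tool_calls)
    return {
        "used_expected_tool": bool(called & expected_tools),
        "no_unexpected_tool": called <= expected_tools,
        "within_call_budget": len(run.tool_calls) <= max_calls,
        "non_empty_answer": bool(run.answer.strip()),
    }


# Example: a factual question where an encyclopedia lookup is expected.
good_run = AgentRun(
    question="What is the capital of France?",
    tool_calls=["wikipedia"],
    answer="Paris",
)
good_report = check_tool_use(good_run, expected_tools={"wikipedia", "wikidata"})

# Example: the agent called a finance tool that the question did not call for.
bad_run = AgentRun(
    question="What is the capital of France?",
    tool_calls=["yahoo_finance"],
    answer="Paris",
)
bad_report = check_tool_use(bad_run, expected_tools={"wikipedia", "wikidata"})
```

Checks of this kind only capture whether the agent behaved as expected; judging answer quality and factual correctness still needs metrics such as those RAGAS provides.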

Description

Final projects of the Master's Degree in Fundamental Principles of Data Science, Faculty of Mathematics, Universitat de Barcelona. Year: 2025. Tutors: Laura Igual Muñoz and Pablo Álvarez

Citation

EGUZKITZA ZALAKAIN, Jokin. Evaluating Tool-Augmented ReAct Language Agents. [accessed: 8 December 2025]. [Available at: https://hdl.handle.net/2445/223176]
