Please use this identifier to cite or link to this item: https://hdl.handle.net/2445/223844
Title: An Interactive LLM-based Conversational Agent for Complex Data Analysis
Author: Jurado González, Rubén
Director/Tutor: Puig Puig, Anna
Rodríguez Santiago, Inmaculada
Keywords: Bots (Programes d'ordinador)
Tractament del llenguatge natural (Informàtica)
Visualització de la informació
Agents intel·ligents (Programari)
Programari
Treballs de fi de grau
Sistemes informàtics interactius
Internet bots (Computer software)
Natural language processing (Computer science)
Information visualization
Intelligent agents (Computer software)
Computer software
Bachelor's theses
Interactive computer systems
Issue Date: 4-Jul-2025
Abstract: Complex multivariate datasets, such as hierarchies and networks with parent-child structures and rich attributes, pose challenges for intuitive exploration and analysis. This work presents an interactive visualization system integrated with a conversational agent (chatbot) to support natural language interaction with such data, especially for domain experts. Users can upload datasets, issue natural language queries, manipulate interface elements (e.g., buttons, panels), and generate custom visualizations including force-directed graphs, circle-packing layouts, and tabular charts. These features enhance data interpretability and engagement. The system includes a robust NLP pipeline based on DistilBERT for intent classification, optimized through data balancing and retraining. Visualizations, rendered in real time with Plotly and D3.js in a Dash interface, support interactions such as zooming, panning, node selection, and dynamic color mapping via language commands. A Retrieval-Augmented Generation (RAG) pipeline enriches chatbot responses using contextual information from uploaded documents. The system also supports misclassification reporting to iteratively refine the NLP model. It handles large-scale hierarchical data efficiently and has been validated on examples like organizational charts and threaded discussions. Notable features include real-time GUI customization, multi-turn conversational support, and popup visualizations from selected data subsets using intuitive queries (e.g., “Show toxicity distribution”). User testing showed high satisfaction among experts, while novices noted a steeper learning curve during onboarding.
Note: Bachelor's theses in Computer Engineering (Treballs Finals de Grau d'Enginyeria Informàtica), Facultat de Matemàtiques, Universitat de Barcelona, Year: 2025, Supervisors: Anna Puig Puig and Inmaculada Rodríguez Santiago
URI: https://hdl.handle.net/2445/223844
Appears in Collections: Treballs Finals de Grau (TFG) - Enginyeria Informàtica
Programari - Treballs de l'alumnat

Files in This Item:
File                             Description      Size        Format
tfg_Jurado_González_Rubén.pdf    Thesis report    8.7 MB      Adobe PDF
codi.zip                         Source code      431.66 kB   zip


Embargoed: document under embargo until 23-10-2026


This item is licensed under a Creative Commons License.