Recreant la presa de decisions humana mitjançant aprenentatge per reforç

Pirla Torrell, Martí

Please use this identifier to cite or link to this item: https://hdl.handle.net/2445/216825

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Cos Aguilera, Ignasi	-
dc.contributor.author	Pirla Torrell, Martí	-
dc.date.accessioned	2024-11-29T07:10:10Z	-
dc.date.available	2024-11-29T07:10:10Z	-
dc.date.issued	2024-07-11	-
dc.identifier.uri	https://hdl.handle.net/2445/216825	-
dc.description	Treballs Finals de Grau d'Enginyeria Informàtica, Facultat de Matemàtiques, Universitat de Barcelona, Any: 2024, Director: Ignasi Cos Aguilera	ca
dc.description.abstract	[ca] En aquest projecte estudiem les dades que va recollir el Michael DePass de com un conjunt de subjectes feien un exercici intentant trobar com aconseguir la màxima recompensa. Els subjectes repetien l’exercici un total de 300 vegades, en el qual havien d’escollir entre dos estímuls en una pantalla. A través d’aquests 300 intents, els subjectes havien de descobrir quins estímuls escollir per aconseguir la millor recompensa possible. Amb aquestes dades, interpretarem el seu comportament i aplicarem aprenentatge per reforç al mateix problema per comparar les diferències en la presa de decisions entre l’algorisme de Q-learning i els subjectes. Finalment, l’objectiu és ajustar els hiperparàmetres d’un agent de Q-learning per aconseguir que el seu comportament s’assimili al màxim al dels subjectes humans. [en] In this project, we study data collected by Michael DePass on how a group of subjects performed a task in which they tried to obtain the maximum reward. The subjects repeated the task 300 times, each time choosing between two stimuli presented on a screen. Over these 300 trials, they had to discover which stimuli to select to achieve the best possible reward. Using this data, we interpret their behavior and apply reinforcement learning to the same problem to compare decision-making between the Q-learning algorithm and the subjects. Finally, the goal is to tune the hyperparameters of a Q-learning agent so that its behavior resembles that of the human subjects as closely as possible.	ca
dc.format.extent	50 p.	-
dc.format.mimetype	application/pdf	-
dc.language.iso	cat	ca
dc.rights	memòria: cc-nc-nd (c) Martí Pirla Torrell, 2024	-
dc.rights	codi: GPL (c) Martí Pirla Torrell, 2024	-
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/es/	-
dc.rights.uri	http://www.gnu.org/licenses/gpl-3.0.ca.html	*
dc.source	Treballs Finals de Grau (TFG) - Enginyeria Informàtica	-
dc.subject.classification	Aprenentatge per reforç (Intel·ligència artificial)	ca
dc.subject.classification	Algorismes computacionals	ca
dc.subject.classification	Aprenentatge automàtic	ca
dc.subject.classification	Programari	ca
dc.subject.classification	Treballs de fi de grau	ca
dc.subject.other	Reinforcement learning	en
dc.subject.other	Computer algorithms	en
dc.subject.other	Machine learning	en
dc.subject.other	Computer software	en
dc.subject.other	Bachelor's theses	en
dc.title	Recreant la presa de decisions humana mitjançant aprenentatge per reforç	ca
dc.type	info:eu-repo/semantics/bachelorThesis	ca
dc.rights.accessRights	info:eu-repo/semantics/openAccess	ca
Appears in Collections:	Treballs Finals de Grau (TFG) - Enginyeria Informàtica; Programari - Treballs de l'alumnat

Files in This Item:

File	Description	Size	Format
tfg_pirla_torrell_marti.pdf	Memòria	3.28 MB	Adobe PDF
codi.zip	Codi font	47.82 MB	zip

This item is licensed under a Creative Commons License