Q-learning in collaborative multiagent systems

dc.contributor.advisor: López Sánchez, Maite
dc.contributor.author: González Trastoy, Alfred
dc.date.accessioned: 2018-08-02T08:53:56Z
dc.date.available: 2018-08-02T08:53:56Z
dc.date.issued: 2018-02
dc.description: Bachelor's final projects in Computer Engineering, Faculty of Mathematics, Universitat de Barcelona, Year: 2018, Advisor: Maite López Sánchez [ca]
dc.description.abstract: Q-learning is one of the most widely used reinforcement learning techniques. It is very effective for learning an optimal policy in any finite Markov decision process (MDP). Collaborative multiagent systems, though, are a challenge for self-interested agent implementations, as higher utility can be achieved via collaboration. To evaluate the efficiency of Q-learning in collaborative multiagent systems, we use a simplified version of the Malmo Collaborative AI Challenge (MCAC). It was designed by Microsoft and consists of a game where 2 players can collaborate to catch the pig (high reward) or leave the game (low reward). Each action costs 1, so knowing when to leave and when to chase the pig is key to achieving high scores. The challenge poses two main problems: uncertainty about the other agent's behaviour and limited learning time. We propose solutions to both problems using a simplified MCAC environment, a state-action abstraction and agent type modelling. We have implemented an agent that is able to identify the other player's behaviour (whether it is collaborating or not) and can learn an optimal policy against each type of player. Results show that Q-learning is an efficient and effective technique for solving collaborative multiagent systems. [ca]
dc.format.extent: 26 p.
dc.format.mimetype: application/pdf
dc.identifier.uri: https://hdl.handle.net/2445/124087
dc.language.iso: eng [ca]
dc.rights: report: cc-by-nc-sa (c) Alfred González Trastoy, 2018
dc.rights: code: GPL (c) Alfred González Trastoy, 2018
dc.rights.accessRights: info:eu-repo/semantics/openAccess [ca]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/es/ [*]
dc.rights.uri: http://www.gnu.org/licenses/gpl-3.0.ca.html [*]
dc.source: Treballs Finals de Grau (TFG) - Enginyeria Informàtica
dc.subject.classification: Aprenentatge automàtic [ca]
dc.subject.classification: Intel·ligència artificial [ca]
dc.subject.classification: Programari [ca]
dc.subject.classification: Treballs de fi de grau [ca]
dc.subject.classification: Aprenentatge per reforç (Intel·ligència artificial) [ca]
dc.subject.classification: Processos de Markov [ca]
dc.subject.other: Machine learning [en]
dc.subject.other: Artificial intelligence [en]
dc.subject.other: Computer software [en]
dc.subject.other: Bachelor's theses [en]
dc.subject.other: Reinforcement learning [en]
dc.subject.other: Markov processes [en]
dc.title: Q-learning in collaborative multiagent systems [ca]
dc.type: info:eu-repo/semantics/bachelorThesis [ca]
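
The abstract describes Q-learning in a finite MDP where every action costs 1 and the agent must choose between leaving (low reward) and chasing the pig (high reward). The following is a minimal single-agent sketch of the tabular Q-learning update on a toy problem with that trade-off. It is not the thesis code and not the MCAC environment: the 4-cell line, the reward values, the hyperparameters, and the names `step`, `q_update`, `greedy`, and `train` are all assumptions made for illustration, and the second player and the agent type modelling are not represented.

```python
import random
from collections import defaultdict

# Hypothetical toy world: cells 0..3 on a line, the agent starts in cell 1.
# Cell 0 stands in for "leave the game", cell 3 for "catch the pig".
ACTIONS = ("left", "right")
EXIT, GOAL = 0, 3
STEP_COST, EXIT_REWARD, GOAL_REWARD = 1, 5, 25  # illustrative values only


def step(state, action):
    """Deterministic toy transition: returns (next_state, reward, done)."""
    nxt = state - 1 if action == "left" else state + 1
    reward = -STEP_COST  # every action costs 1
    if nxt == EXIT:
        return nxt, reward + EXIT_REWARD, True
    if nxt == GOAL:
        return nxt, reward + GOAL_REWARD, True
    return nxt, reward, False


def q_update(q, s, a, r, s2, done, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: Q(s,a) += alpha * (target - Q(s,a))."""
    best_next = 0.0 if done else max(q[(s2, b)] for b in ACTIONS)
    q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
    return q[(s, a)]


def greedy(q, s):
    """Action with the highest current Q-value in state s."""
    return max(ACTIONS, key=lambda a: q[(s, a)])


def train(episodes=2000, epsilon=0.2, seed=0, max_steps=50):
    """Epsilon-greedy training loop over the toy world."""
    rng = random.Random(seed)
    q = defaultdict(float)  # unseen (state, action) pairs default to 0.0
    for _ in range(episodes):
        s = 1
        for _ in range(max_steps):
            explore = rng.random() < epsilon
            a = rng.choice(ACTIONS) if explore else greedy(q, s)
            s2, r, done = step(s, a)
            q_update(q, s, a, r, s2, done)
            if done:
                break
            s = s2
    return q


q = train()
print("greedy action in the start cell:", greedy(q, 1))
```

With these illustrative numbers, leaving from the start cell is worth 5 - 1 = 4, while heading for the goal is worth -1 + 0.9 * (25 - 1) = 20.6 under the discount factor 0.9, so the per-action cost does not outweigh the larger reward here. Moving the goal further away or lowering its reward would shift that balance toward leaving.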

Files

Original bundle

Name: codi_font.zip
Size: 656.72 KB
Format: ZIP file
Description: Source code

Name: memoria.pdf
Size: 1.29 MB
Format: Adobe Portable Document Format
Description: Report