Please use this identifier to cite or link to this item: http://hdl.handle.net/2445/192920
Full metadata record
dc.contributor.author: Rodriguez Soto, Manel
dc.contributor.author: Serramia, Marc
dc.contributor.author: López Sánchez, Maite
dc.contributor.author: Rodríguez-Aguilar, Juan A. (Juan Antonio)
dc.date.accessioned: 2023-02-01T09:10:35Z
dc.date.available: 2023-02-01T09:10:35Z
dc.date.issued: 2022-01-24
dc.identifier.issn: 1388-1957
dc.identifier.uri: http://hdl.handle.net/2445/192920
dc.description.abstract: AI research is challenged with ensuring that autonomous agents learn to behave ethically, namely, in alignment with moral values. Here, we propose a novel way of tackling the value alignment problem as a two-step process. The first step consists in formalising moral values and value-aligned behaviour on philosophical foundations. Our formalisation is compatible with the framework of (Multi-Objective) Reinforcement Learning, to ease the handling of an agent's individual and ethical objectives. The second step consists in designing an environment wherein the agent learns to behave ethically while pursuing its individual objective. We leverage our theoretical results to introduce an algorithm that automates this two-step approach. In cases where value-aligned behaviour is possible, our algorithm produces a learning environment in which the agent will learn a value-aligned behaviour.
dc.format.extent: 17 p.
dc.format.mimetype: application/pdf
dc.language.iso: eng
dc.publisher: Springer
dc.relation.isformatof: Reproduction of the document published at: https://doi.org/10.1007/s10676-022-09635-0
dc.relation.ispartof: Ethics and Information Technology, 2022, vol. 24
dc.relation.uri: https://doi.org/10.1007/s10676-022-09635-0
dc.rights: cc by (c) Manel Rodríguez Soto et al., 2022
dc.rights.uri: http://creativecommons.org/licenses/by/3.0/es/
dc.source: Articles publicats en revistes (Matemàtiques i Informàtica)
dc.subject.classification: Intel·ligència artificial
dc.subject.classification: Aprenentatge per reforç (Intel·ligència artificial)
dc.subject.classification: Ètica
dc.subject.classification: Aspectes morals
dc.subject.other: Artificial intelligence
dc.subject.other: Reinforcement learning
dc.subject.other: Ethics
dc.subject.other: Moral aspects
dc.title: Instilling moral value alignment by means of multi-objective reinforcement learning
dc.type: info:eu-repo/semantics/article
dc.type: info:eu-repo/semantics/publishedVersion
dc.identifier.idgrec: 715848
dc.date.updated: 2023-02-01T09:10:35Z
dc.rights.accessRights: info:eu-repo/semantics/openAccess
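
The abstract's multi-objective framing can be illustrated with a minimal sketch (all names and numbers here are hypothetical, not the paper's algorithm): each action yields a reward vector (individual, ethical), and under a linear scalarisation a sufficiently large ethical weight makes the value-aligned action the preferred one.

```python
# Hypothetical sketch: linear scalarisation of a two-objective reward
# vector (individual, ethical). With ethical weight 0 the selfish
# action wins; a large enough weight makes the aligned action dominate.

def scalarise(reward_vec, w_ethical):
    individual, ethical = reward_vec
    return individual + w_ethical * ethical

# Toy action set: 'shortcut' pays more individually but carries an
# ethical penalty; 'aligned' pays less but incurs no penalty.
actions = {
    "shortcut": (1.0, -1.0),
    "aligned":  (0.6,  0.0),
}

def best_action(w_ethical):
    return max(actions, key=lambda a: scalarise(actions[a], w_ethical))

print(best_action(0.0))  # shortcut: ethics ignored
print(best_action(1.0))  # aligned: the penalty now outweighs the gain
```

The interesting question, which the paper's theoretical results address in the full multi-objective RL setting, is how to choose the environment and weighting so that such alignment is guaranteed rather than hand-tuned.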
Appears in Collections: Articles publicats en revistes (Matemàtiques i Informàtica)

Files in This Item:
File: 715848.pdf (1.86 MB, Adobe PDF)


This item is licensed under a Creative Commons License.