Non-acted multi-view audio-visual dyadic Interactions. Project master thesis: multi-modal local and recurrent non-verbal emotion recognition in dyadic scenarios

Barco Terrones, Rubén

Please use this identifier to cite or link to this item: http://hdl.handle.net/2445/159257

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Escalera Guerrero, Sergio	-
dc.contributor.advisor	Palmero, Cristina	-
dc.contributor.author	Barco Terrones, Rubén	-
dc.date.accessioned	2020-05-08T07:33:41Z	-
dc.date.available	2020-05-08T07:33:41Z	-
dc.date.issued	2019-09-02	-
dc.identifier.uri	http://hdl.handle.net/2445/159257	-
dc.description	Master's final projects, Màster de Fonaments de Ciència de Dades, Facultat de Matemàtiques, Universitat de Barcelona, Year: 2019, Advisors: Sergio Escalera Guerrero and Cristina Palmero	ca
dc.description.abstract	[en] This master's thesis focuses on the development of a baseline emotion recognition system for a dyadic environment, using raw and handcrafted audio features and cropped faces from the videos. The system is analyzed at frame and utterance level, with and without temporal information. To this end, an exhaustive study of the state of the art in emotion recognition techniques has been conducted, paying particular attention to Deep Learning techniques. In parallel with this theoretical study, a dataset has been recorded consisting of videos of sessions of dyadic interactions between individuals in different scenarios. Different attributes were captured and labelled from these videos: body pose, hand pose, emotion, age, gender, etc. Once the architectures for emotion recognition have been trained on another dataset, a proof of concept is carried out with this new database in order to draw conclusions. In addition, this database can help future systems achieve better results. A large number of experiments with audio and video are performed to create the emotion recognition system. The IEMOCAP database is used for the training and evaluation experiments. Once the audio and video models have been trained separately with two different architectures, the two are fused. This work studies and demonstrates the importance of preprocessing the data (e.g. face detection, analysis window length, handcrafted features) and of choosing the correct parameters for the architectures (e.g. network depth, fusion). Further experiments use recurrent models for spatiotemporal utterance-level emotion recognition in order to study the influence of temporal information.
Finally, the conclusions drawn throughout this work are presented, together with possible lines of future work, including new systems for emotion recognition and experiments with the database recorded in this work.	ca
dc.format.extent	65 p.	-
dc.format.mimetype	application/pdf	-
dc.language.iso	eng	ca
dc.rights	cc-by-sa (c) Rubén Barco Terrones, 2019	-
dc.rights	code: GPL (c) Rubén Barco Terrones, 2019	-
dc.rights.uri	http://creativecommons.org/licenses/by-sa/3.0/es/	*
dc.rights.uri	http://www.gnu.org/licenses/gpl-3.0.ca.html	-
dc.source	Màster Oficial - Fonaments de la Ciència de Dades	-
dc.subject.classification	Aprenentatge automàtic	-
dc.subject.classification	Emocions	-
dc.subject.classification	Treballs de fi de màster	-
dc.subject.classification	Expressió facial	-
dc.subject.other	Machine learning	-
dc.subject.other	Emotions	-
dc.subject.other	Master's theses	-
dc.subject.other	Facial expression	-
dc.title	Non-acted multi-view audio-visual dyadic Interactions. Project master thesis: multi-modal local and recurrent non-verbal emotion recognition in dyadic scenarios	ca
dc.type	info:eu-repo/semantics/masterThesis	ca
dc.rights.accessRights	info:eu-repo/semantics/openAccess	ca
Appears in Collections:	Programari - Treballs de l'alumnat Màster Oficial - Fonaments de la Ciència de Dades

Files in This Item:

File	Description	Size	Format
159257.pdf	Thesis report	21.99 MB	Adobe PDF
codi_font.zip		1.99 MB	zip

This item is licensed under a Creative Commons License