Egocentric video description based on temporally-linked sequences

Bolaños Solà, Marc; Peris, Álvaro; Casacuberta, Francisco; Soler, Sergi; Radeva, Petia

Please use this identifier to cite or link to this item: http://hdl.handle.net/2445/143165

Full metadata record

DC Field	Value	Language
dc.contributor.author	Bolaños Solà, Marc	-
dc.contributor.author	Peris, Álvaro	-
dc.contributor.author	Casacuberta, Francisco	-
dc.contributor.author	Soler, Sergi	-
dc.contributor.author	Radeva, Petia	-
dc.date.accessioned	2019-10-25T10:11:47Z	-
dc.date.available	2020-01-31T06:10:17Z	-
dc.date.issued	2018-01	-
dc.identifier.issn	1047-3203	-
dc.identifier.uri	http://hdl.handle.net/2445/143165	-
dc.description.abstract	Egocentric vision consists of acquiring images throughout the day from a first-person point of view using wearable cameras. The automatic analysis of this information makes it possible to discover daily patterns that can improve the user's quality of life. A natural topic that arises in egocentric vision is storytelling, that is, how to understand and tell the story lying behind the pictures. In this paper, we tackle storytelling as an egocentric sequence description problem. We propose a novel methodology that exploits information from temporally neighboring events, precisely matching the nature of egocentric sequences. Furthermore, we present a new method for multimodal data fusion consisting of a multi-input attention recurrent network. We also release the EDUB-SegDesc dataset. This is the first dataset for egocentric image sequence description, consisting of 1339 events with 3991 descriptions, collected over 55 days by 11 people. Finally, we show that our proposal outperforms classical attentional encoder-decoder methods for video description.	-
dc.format.extent	12 p.	-
dc.format.mimetype	application/pdf	-
dc.language.iso	eng	-
dc.publisher	Elsevier	-
dc.relation.isformatof	Postprint version of the document published at:	-
dc.relation.ispartof	Journal of Visual Communication and Image Representation, 2018, vol. 50, p. 205-216	-
dc.rights	cc-by-nc-nd (c) Academic Press, 2018	-
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/es	-
dc.source	Articles publicats en revistes (Matemàtiques i Informàtica)	-
dc.subject.classification	Aprenentatge visual	-
dc.subject.classification	Vídeo en l'ensenyament	-
dc.subject.other	Visual learning	-
dc.subject.other	Video tapes in education	-
dc.title	Egocentric video description based on temporally-linked sequences	-
dc.type	info:eu-repo/semantics/article	-
dc.type	info:eu-repo/semantics/acceptedVersion	-
dc.identifier.idgrec	684160	-
dc.date.updated	2019-10-25T10:11:48Z	-
dc.rights.accessRights	info:eu-repo/semantics/openAccess	-
Appears in Collections:	Articles publicats en revistes (Matemàtiques i Informàtica)

Files in This Item:

File	Description	Size	Format
684160.pdf		3.16 MB	Adobe PDF

This item is licensed under a Creative Commons License