Egocentric video description based on temporally-linked sequences

Bolaños Solà, Marc; Peris, Álvaro; Casacuberta, Francisco; Soler, Sergi; Radeva, Petia

dc.contributor.author	Bolaños Solà, Marc
dc.contributor.author	Peris, Álvaro
dc.contributor.author	Casacuberta, Francisco
dc.contributor.author	Soler, Sergi
dc.contributor.author	Radeva, Petia
dc.date.accessioned	2019-10-25T10:11:47Z
dc.date.available	2020-01-31T06:10:17Z
dc.date.issued	2018-01
dc.date.updated	2019-10-25T10:11:48Z
dc.description.abstract	Egocentric vision consists of acquiring images throughout the day from a first-person point of view using wearable cameras. The automatic analysis of this information makes it possible to discover daily patterns that can improve the user's quality of life. A natural topic that arises in egocentric vision is storytelling, that is, how to understand and tell the story behind the pictures. In this paper, we tackle storytelling as an egocentric sequence description problem. We propose a novel methodology that exploits information from temporally neighboring events, precisely matching the nature of egocentric sequences. Furthermore, we present a new method for multimodal data fusion consisting of a multi-input attention recurrent network. We also release the EDUB-SegDesc dataset. This is the first dataset for egocentric image sequence description, consisting of 1339 events with 3991 descriptions, from 55 days acquired by 11 people. Finally, we show that our proposal outperforms classical attentional encoder-decoder methods for video description.
dc.format.extent	12 p.
dc.format.mimetype	application/pdf
dc.identifier.idgrec	684160
dc.identifier.issn	1047-3203
dc.identifier.uri	https://hdl.handle.net/2445/143165
dc.language.iso	eng
dc.publisher	Elsevier
dc.relation.isformatof	Postprint version of the document published at:
dc.relation.ispartof	Journal of Visual Communication and Image Representation, 2018, vol. 50, p. 205-216
dc.rights	cc-by-nc-nd (c) Academic Press, 2018
dc.rights.accessRights	info:eu-repo/semantics/openAccess
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/es
dc.source	Articles published in journals (Mathematics and Computer Science)
dc.subject.classification	Aprenentatge visual
dc.subject.classification	Vídeo en l'ensenyament
dc.subject.other	Visual learning
dc.subject.other	Video tapes in education
dc.title	Egocentric video description based on temporally-linked sequences
dc.type	info:eu-repo/semantics/article
dc.type	info:eu-repo/semantics/acceptedVersion

Files

Original bundle

Name: 684160.pdf
Size: 3.08 MB
Format: Adobe Portable Document Format

Collections

Articles published in journals (Mathematics and Computer Science)