Egocentric video description based on temporally-linked sequences

dc.contributor.authorBolaños Solà, Marc
dc.contributor.authorPeris, Álvaro
dc.contributor.authorCasacuberta, Francisco
dc.contributor.authorSoler, Sergi
dc.contributor.authorRadeva, Petia
dc.date.accessioned2019-10-25T10:11:47Z
dc.date.available2020-01-31T06:10:17Z
dc.date.issued2018-01
dc.date.updated2019-10-25T10:11:48Z
dc.description.abstractEgocentric vision consists of acquiring images throughout the day from a first-person point of view using wearable cameras. The automatic analysis of this information makes it possible to discover daily patterns for improving the user's quality of life. A natural topic that arises in egocentric vision is storytelling, that is, how to understand and tell the story lying behind the pictures. In this paper, we tackle storytelling as an egocentric sequence description problem. We propose a novel methodology that exploits information from temporally neighboring events, precisely matching the nature of egocentric sequences. Furthermore, we present a new method for multimodal data fusion consisting of a multi-input attention recurrent network. We also release the EDUB-SegDesc dataset. This is the first dataset for egocentric image sequence description, consisting of 1339 events with 3991 descriptions, from 55 days acquired by 11 people. Finally, we show that our proposal outperforms classical attentional encoder-decoder methods for video description.
dc.format.extent12 p.
dc.format.mimetypeapplication/pdf
dc.identifier.idgrec684160
dc.identifier.issn1047-3203
dc.identifier.urihttps://hdl.handle.net/2445/143165
dc.language.isoeng
dc.publisherElsevier
dc.relation.isformatofPostprint version of the document published at:
dc.relation.ispartofJournal of Visual Communication and Image Representation, 2018, vol. 50, p. 205-216
dc.rightscc-by-nc-nd (c) Academic Press, 2018
dc.rights.accessRightsinfo:eu-repo/semantics/openAccess
dc.rights.urihttp://creativecommons.org/licenses/by-nc-nd/3.0/es
dc.sourceArticles published in journals (Mathematics and Computer Science)
dc.subject.classificationAprenentatge visual
dc.subject.classificationVídeo en l'ensenyament
dc.subject.otherVisual learning
dc.subject.otherVideo tapes in education
dc.titleEgocentric video description based on temporally-linked sequences
dc.typeinfo:eu-repo/semantics/article
dc.typeinfo:eu-repo/semantics/acceptedVersion

Files

Original bundle

Name: 684160.pdf
Size: 3.08 MB
Format: Adobe Portable Document Format