Clapés i Sintes, Albert (director)
Escalera Guerrero, Sergio (director)
Yuste Ramos, Joaquim (author)
Dates: 2022-01-31; 2022-01-31; 2021-06-20
Handle: https://hdl.handle.net/2445/182804
Description: Bachelor's thesis, Computer Engineering, Facultat de Matemàtiques, Universitat de Barcelona. Year: 2021. Directors: Albert Clapés and Sergio Escalera Guerrero.

Abstract [en]: This project focuses on the video action segmentation task, which aims to temporally segment and classify fine-grained actions in untrimmed videos. Developing and refining this capability is an important yet challenging problem, with potential impact in areas such as robotics, e-Health assistive technologies, and surveillance. On the one hand, we review the current state of the art, as well as the metrics commonly used to evaluate architectures for this kind of problem. On the other hand, we introduce two attention-based modules capable of extracting frame-to-frame relationships, and analyse their behaviour by evaluating them on the Georgia Tech Egocentric Activity (GTEA) dataset, a well-established benchmark of daily cooking activity videos with fine-grained labels recorded from an egocentric point of view. Finally, we compare the obtained results against the current state-of-the-art scores in order to discuss the effectiveness of each module.

Extent: 45 p.
Format: application/pdf
Language: eng
Rights: thesis: cc-by-nc-nd (c) Joaquim Yuste Ramos, 2021; code: MIT License (c) Joaquim Yuste Ramos, 2021
License URLs: http://creativecommons.org/licenses/by-nc-nd/3.0/es/; https://opensource.org/licenses/MIT
Keywords: Machine learning; Computer vision; Computer software; Convolutional neural networks; Pattern recognition systems; Neural networks (Computer science); Bachelor's theses
Title: Using deep learning for fine-grained action segmentation
Type: info:eu-repo/semantics/bachelorThesis
Access: info:eu-repo/semantics/openAccess
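The abstract does not specify how the attention-based modules are implemented. As illustration only, a generic single-head self-attention step over per-frame features — the standard mechanism for extracting frame-to-frame relationships — might look like the following NumPy sketch; all names, shapes, and the random projections are assumptions, not the thesis code:

```python
import numpy as np

def frame_self_attention(x, d_k=None, seed=0):
    """Toy single-head self-attention over per-frame features.

    x: (T, D) array of per-frame features (T frames, D dimensions).
    Returns a (T, D) array where each frame's output is a weighted
    mix of all frames' value vectors.

    Illustrative sketch only -- NOT the thesis's actual modules;
    the random projections stand in for learned weight matrices.
    """
    T, D = x.shape
    d_k = d_k or D
    rng = np.random.default_rng(seed)
    Wq = rng.standard_normal((D, d_k)) / np.sqrt(D)  # query projection
    Wk = rng.standard_normal((D, d_k)) / np.sqrt(D)  # key projection
    Wv = rng.standard_normal((D, D)) / np.sqrt(D)    # value projection
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    scores = Q @ K.T / np.sqrt(d_k)  # (T, T) frame-to-frame affinities
    # Row-wise softmax (numerically stabilised) gives attention weights.
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V

# Hypothetical usage: 8 frames with 16-dimensional features each.
feats = np.random.default_rng(1).standard_normal((8, 16))
out = frame_self_attention(feats)
print(out.shape)  # (8, 16)
```

Each row of the (T, T) score matrix is one frame's affinity to every other frame, which is the "frame-to-frame relationship" the abstract refers to; the thesis's two modules would differ in how these affinities are computed and combined.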