A multimodal deep learning approach for food tray recognition

dc.contributor.advisor: Bolaños, Marc
dc.contributor.advisor: Radeva, Petia
dc.contributor.author: Peracaula Prat, Joan
dc.date.accessioned: 2021-02-08T09:31:55Z
dc.date.available: 2021-02-08T09:31:55Z
dc.date.issued: 2020-09-13
dc.description: Bachelor's theses in Computer Engineering, Facultat de Matemàtiques, Universitat de Barcelona, Year: 2020, Advisors: Marc Bolaños and Petia Radeva [ca]
dc.description.abstract: [en] Food recognition, that is, object detection and classification applied to the food domain, is the main topic of this work. We study the problem of recognising food instances in tray images from self-service restaurants and propose a novel multimodal deep learning approach. Given images and daily menus, the presented model uses two state-of-the-art models for object detection and classification, together with a multimodal neural network, to make significantly refined predictions compared to the baseline object detection model, achieving a class-weighted average F1-score of 0.862. An ensemble model built from the proposed and the baseline models, also presented in this work, further improves the results, achieving a class-weighted average F1-score of 0.877. [ca]
dc.format.extent: 81 p.
dc.format.mimetype: application/pdf
dc.identifier.uri: https://hdl.handle.net/2445/173728
dc.language.iso: eng [ca]
dc.rights: report: cc-nc-nd (c) Joan Peracaula Prat, 2020
dc.rights: code: GPL (c) Joan Peracaula Prat, 2019
dc.rights.accessRights: info:eu-repo/semantics/openAccess [ca]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/es/
dc.rights.uri: http://www.gnu.org/licenses/gpl-3.0.ca.html [*]
dc.source: Treballs Finals de Grau (TFG) - Enginyeria Informàtica
dc.subject.classification: Xarxes neuronals (Informàtica) [ca]
dc.subject.classification: Aprenentatge automàtic [ca]
dc.subject.classification: Programari [ca]
dc.subject.classification: Treballs de fi de grau [ca]
dc.subject.classification: Processament digital d'imatges [ca]
dc.subject.classification: Visió per ordinador [ca]
dc.subject.classification: Aliments [ca]
dc.subject.other: Neural networks (Computer science) [en]
dc.subject.other: Machine learning [en]
dc.subject.other: Computer software [en]
dc.subject.other: Digital image processing [en]
dc.subject.other: Computer vision [en]
dc.subject.other: Bachelor's theses [en]
dc.subject.other: Food [en]
dc.title: A multimodal deep learning approach for food tray recognition [ca]
dc.type: info:eu-repo/semantics/bachelorThesis [ca]

Files

Original bundle

Name: codi.zip
Size: 3.82 MB
Format: ZIP file
Description: Source code

Name: 173728.pdf
Size: 7.66 MB
Format: Adobe Portable Document Format
Description: Report