Please use this identifier to cite or link to this item:
http://hdl.handle.net/2445/186800
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Radeva, Petia | - |
dc.contributor.advisor | Tatjer i Montaña, Joan Carles | - |
dc.contributor.author | Rial Figols, David | - |
dc.date.accessioned | 2022-06-20T07:00:17Z | - |
dc.date.available | 2022-06-20T07:00:17Z | - |
dc.date.issued | 2022-01-24 | - |
dc.identifier.uri | http://hdl.handle.net/2445/186800 | - |
dc.description | Bachelor's thesis in Computer Engineering, Facultat de Matemàtiques, Universitat de Barcelona, Year: 2022, Advisors: Petia Radeva and Joan Carles Tatjer i Montaña | ca |
dc.description.abstract | [en] The field of Deep Learning is constantly evolving to optimize models so that they achieve a specific task with the best possible accuracy. In 2017, Vaswani et al. introduced a new neural network architecture that enabled a new evolution: transformers. Based on the concept of attention, introduced in 2014, transformers quickly established themselves across all natural language processing tasks. It was not until 2020 that transformers applied to image-related tasks started to be competitive, and in less than two years they have surpassed previous neural network architectures and now achieve the best results. Among these image tasks, image classification stands out: the problem of assigning each image a label that describes it, which has historically been used to track the evolution of Deep Learning and the progress made. Despite being at the forefront of all the tasks mentioned, transformers, especially image transformers, are still a black box as to why they learn or what makes one transformer better than another. For this reason, this work focuses on the study of transformers. Specifically, it aims to introduce transformers and the basics needed to understand how they work, and then to investigate how transformers dedicated to image classification learn, looking for their similarities and differences and for what characterizes them. We propose a comparative framework for transformers dedicated to image classification, based on the results achieved with the transformers ViT, BEiT, DeiT, Swin and CSWin on the Food-101 food image dataset. This framework is based on the properties and evolution of the weight matrices that make up these transformers, analyzed through the singular values of the matrices and their norms and ranks. In addition, a training procedure is proposed that speeds up training on a specific dataset, reducing training time by 33% without losing accuracy. | ca |
dc.format.extent | 54 p. | - |
dc.format.mimetype | application/pdf | - |
dc.language.iso | cat | ca |
dc.rights | thesis report: cc-nc-nd (c) David Rial Figols, 2022 | - |
dc.rights | code: GPL (c) David Rial Figols, 2022 | - |
dc.rights.uri | http://www.gnu.org/licenses/gpl-3.0.ca.html | - |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/es/ | * |
dc.source | Treballs Finals de Grau (TFG) - Enginyeria Informàtica | - |
dc.subject.classification | Aprenentatge automàtic | ca |
dc.subject.classification | Xarxes neuronals (Informàtica) | ca |
dc.subject.classification | Programari | ca |
dc.subject.classification | Treballs de fi de grau | ca |
dc.subject.classification | Visió per ordinador | ca |
dc.subject.classification | Sistemes classificadors (Intel·ligència artificial) | ca |
dc.subject.classification | Aliments | - |
dc.subject.other | Machine learning | en |
dc.subject.other | Neural networks (Computer science) | en |
dc.subject.other | Computer software | en |
dc.subject.other | Computer vision | en |
dc.subject.other | Learning classifier systems | en |
dc.subject.other | Bachelor's theses | en |
dc.subject.other | Food | - |
dc.title | Un nou marc per analitzar transformers de visió. Aplicació a l'anàlisi d'imatges de menjar [A new framework for analyzing vision transformers. Application to food image analysis] | ca |
dc.type | info:eu-repo/semantics/bachelorThesis | ca |
dc.rights.accessRights | info:eu-repo/semantics/openAccess | ca |
Appears in Collections: | Treballs Finals de Grau (TFG) - Enginyeria Informàtica; Treballs Finals de Grau (TFG) - Matemàtiques; Programari - Treballs de l'alumnat |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
codi.zip | Source code | 20.41 kB | zip |
tfg_rial_figols_david.pdf | Thesis report | 5.3 MB | Adobe PDF |
This item is licensed under a Creative Commons License