Attention mechanisms in transformers: a new formula with mathematical foundations and enhanced interpretability

Author: Conti, Eddie
Supervisors: Vieiro Yanes, Arturo; Pujol Vila, Oriol
Record dates: 2024-09-10, 2024-09-10, 2024-07-09
URI: https://hdl.handle.net/2445/215077
Description: Final projects of the Master's in Fundamentals of Data Science, Faculty of Mathematics, Universitat de Barcelona. Academic year: 2023-2024. Supervisors: Arturo Vieiro Yanes and Oriol Pujol Vila

Abstract: [en] Large Language Models (LLMs) are AI systems capable of understanding and generating human language by processing vast amounts of text data. Since 2017, the use of LLMs has increased significantly thanks to the introduction of the Transformer architecture.

Extent: 35 p.
Format: application/pdf
Language: eng
Rights: cc-by-nc-nd (c) Eddie Conti, 2023 (http://creativecommons.org/licenses/by-nc-nd/3.0/es/)
Code rights: GPL (c) Eddie Conti, 2023 (http://www.gnu.org/licenses/gpl-3.0.ca.html)
Subjects: Natural language processing (Computer science); Data processing; Neural networks (Computer science); Master's thesis
Type: info:eu-repo/semantics/masterThesis
Access: info:eu-repo/semantics/openAccess