Please use this identifier to cite or link to this item: http://hdl.handle.net/2445/187820
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Ortiz Martínez, Daniel | -
dc.contributor.author | Safont Gascón, Pol | -
dc.date.accessioned | 2022-07-18T08:34:21Z | -
dc.date.available | 2022-07-18T08:34:21Z | -
dc.date.issued | 2022-01-24 | -
dc.identifier.uri | http://hdl.handle.net/2445/187820 | -
dc.description | Bachelor's thesis (Treball Final de Grau) in Computer Engineering, Facultat de Matemàtiques, Universitat de Barcelona. Year: 2022. Advisor: Daniel Ortiz Martínez | ca
dc.description.abstract | [en] Deep neural networks have become the state of the art in many complex computational tasks. While they achieve substantial improvements on benchmark tasks year after year, they operate as black boxes, making it hard for both data scientists and end users to understand their inner decision mechanisms and to trust their results. Statistical and interpretability methods are widely used to analyze them, but these do not fully capture their internal mechanisms and are prone to misleading results, so better tools are needed. Self-explaining methods embedded in the architecture of the networks have therefore emerged as a possible alternative, with attention mechanisms among the main new techniques. The project's main focus is the word alignment task: finding the most relevant translation relationships between source and target words in a pair of parallel sentences in different languages. This is a complex task in natural language processing and machine translation, and we analyze the use of attention mechanisms embedded in encoder-decoder neural networks to extract word-to-word alignments between source and target translations as a byproduct of the translation task. The first part reviews the background of the machine translation field: the main traditional statistical methods, the neural machine translation approach to the sequence-to-sequence problem, and finally the word alignment task and the attention mechanism. In the second part, we implement a deep neural machine translation model, a recurrent neural network with an encoder-decoder architecture and attention, and we propose an alignment generation mechanism that uses the attention layer to extract and predict source-to-target word-to-word alignments. Finally, we train the networks on an English-French bilingual parallel sentence corpus, analyze the experimental results of the model on the translation and word alignment tasks using a variety of metrics, and suggest improvements and alternatives. | ca
dc.format.extent | 66 p. | -
dc.format.mimetype | application/pdf | -
dc.language.iso | cat | ca
dc.rights | thesis report: cc-by-nc-nd (c) Pol Safont Gascón, 2022 | -
dc.rights | code: GPL (c) Pol Safont Gascón, 2022 | -
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/es/ | -
dc.rights.uri | http://www.gnu.org/licenses/gpl-3.0.ca.html | -
dc.source | Treballs Finals de Grau (TFG) - Enginyeria Informàtica | -
dc.subject.classification | Xarxes neuronals (Informàtica) | ca
dc.subject.classification | Traducció automàtica | ca
dc.subject.classification | Programari | ca
dc.subject.classification | Treballs de fi de grau | ca
dc.subject.classification | Tractament del llenguatge natural (Informàtica) | ca
dc.subject.classification | Aprenentatge automàtic | ca
dc.subject.other | Neural networks (Computer science) | en
dc.subject.other | Machine translating | en
dc.subject.other | Computer software | en
dc.subject.other | Natural language processing (Computer science) | en
dc.subject.other | Machine learning | en
dc.subject.other | Bachelor's theses | en
dc.title | Alineació de paraules i mecanismes d'atenció en sistemes de traducció automàtica neuronal | ca
dc.type | info:eu-repo/semantics/bachelorThesis | ca
dc.rights.accessRights | info:eu-repo/semantics/openAccess | ca
Appears in Collections:Programari - Treballs de l'alumnat
Treballs Finals de Grau (TFG) - Enginyeria Informàtica

Files in This Item:
File | Description | Size | Format
codi.zip | Source code | 377.21 kB | zip
tfg_safont_gascon_pol.pdf | Thesis report | 2.62 MB | Adobe PDF


This item is licensed under a Creative Commons License.
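
The abstract describes extracting word-to-word alignments as a byproduct of an attention-based encoder-decoder translation model. A minimal illustrative sketch of the usual extraction idea — aligning each target word to the source word receiving its highest attention weight — is shown below. This is not the thesis code: the function name `extract_alignments` and the toy attention values are hypothetical, and it assumes an attention matrix with one row per target word is already available.

```python
def extract_alignments(attention, src_tokens, tgt_tokens):
    """Return (source_index, target_index) pairs by taking, for each
    target word, the source position with the largest attention weight.

    attention: list of rows, one per target token; attention[i][j] is the
    weight that target token i assigns to source token j.
    """
    alignments = []
    for i, row in enumerate(attention):
        j = max(range(len(row)), key=lambda k: row[k])  # per-row argmax
        alignments.append((j, i))
    return alignments


# Toy French -> English example with made-up attention weights.
src = ["le", "chat", "noir"]
tgt = ["the", "black", "cat"]
attn = [
    [0.8, 0.1, 0.1],  # "the"   attends mostly to "le"
    [0.1, 0.2, 0.7],  # "black" attends mostly to "noir"
    [0.1, 0.7, 0.2],  # "cat"   attends mostly to "chat"
]
print(extract_alignments(attn, src, tgt))  # -> [(0, 0), (2, 1), (1, 2)]
```

The per-row argmax is the simplest decision rule; thresholding the weights or taking a symmetrized argmax over both translation directions are common refinements.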