Please use this identifier to cite or link to this item:
http://hdl.handle.net/2445/182589
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Salamó Llorente, Maria | - |
dc.contributor.author | Sánchez Lladó, Ferran | - |
dc.date.accessioned | 2022-01-26T10:19:31Z | - |
dc.date.available | 2022-01-26T10:19:31Z | - |
dc.date.issued | 2021-07-20 | - |
dc.identifier.uri | http://hdl.handle.net/2445/182589 | - |
dc.description | Treballs Finals de Grau d'Enginyeria Informàtica, Facultat de Matemàtiques, Universitat de Barcelona, Year: 2021, Advisor: Maria Salamó Llorente | ca |
dc.description.abstract | [en] Social networks have an ever-growing presence in our daily lives and have become platforms for sharing information. However, they can also be used to send hate messages or to propagate false news, and users can take advantage of their anonymity to engage in these toxic interactions. Furthermore, some groups of people (minorities) are disproportionately targeted. This raises the problem of how to detect whether a message contains hate speech. A solution could be the use of machine learning models in charge of this decision; in addition, such models could handle the enormous amount of text exchanged daily. There are many approaches to tackling the problem, divided mainly into two groups: the first uses classical algorithms to extract information from the text, while the second uses deep learning models that can capture some context, which allows for better predictions. The main objectives of the project are the exploration and comparison of different types of models and techniques. The models are trained on three distinct toxicity datasets from two natural language processing competitions. In general, the best-performing model is BERT or SBERT, both based on the deep learning approach, with metric scores much higher than any model based on traditional methods. The results show the vast potential of natural language processing for the detection of hate speech. Although even the best models leave room for improvement, a more reliable model could be trained with more training data or new architectures. Even in their current state, the models could be used as an external source to help humans in the decision-making process. Moreover, these models could filter out the most confident predictions while leaving the rest to the review team. | ca |
dc.format.extent | 67 p. | - |
dc.format.mimetype | application/pdf | - |
dc.language.iso | eng | ca |
dc.rights | thesis report: cc-nc-nd (c) Ferran Sánchez Lladó, 2021 | - |
dc.rights.uri | http://creativecommons.org/licenses/by-sa/3.0/es/ | * |
dc.source | Treballs Finals de Grau (TFG) - Enginyeria Informàtica | - |
dc.subject.classification | Xarxes socials | ca |
dc.subject.classification | Discurs de l'odi | ca |
dc.subject.classification | Programari | ca |
dc.subject.classification | Treballs de fi de grau | ca |
dc.subject.classification | Aprenentatge automàtic | ca |
dc.subject.classification | Algorismes computacionals | ca |
dc.subject.classification | Tractament del llenguatge natural (Informàtica) | ca |
dc.subject.other | Social networks | en |
dc.subject.other | Hate speech | en |
dc.subject.other | Computer software | en |
dc.subject.other | Machine learning | en |
dc.subject.other | Computer algorithms | en |
dc.subject.other | Bachelor's theses | en |
dc.subject.other | Natural language processing (Computer science) | en |
dc.title | Analysis of hate speech detection in social media | ca |
dc.type | info:eu-repo/semantics/bachelorThesis | ca |
dc.rights.accessRights | info:eu-repo/semantics/openAccess | ca |
Appears in Collections: | Treballs Finals de Grau (TFG) - Enginyeria Informàtica |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
tfg_ferran_sanchez_llado.pdf | Thesis report | 2.19 MB | Adobe PDF | View/Open |
This item is licensed under a Creative Commons License