News similarity with natural language processing

dc.contributor.advisor: Vitrià i Marca, Jordi
dc.contributor.author: Parafita Martínez, Álvaro
dc.date.accessioned: 2016-04-15T10:40:51Z
dc.date.available: 2016-04-15T10:40:51Z
dc.date.issued: 2016-01-28
dc.description: Final degree projects in Computer Engineering, Faculty of Mathematics, Universitat de Barcelona, Year: 2016, Advisor: Jordi Vitrià i Marca [ca]
dc.description.abstract: News articles are pieces of natural language that follow the 5W1H model: they should answer six questions, namely What, Who, Where, When, Why and How. This project takes advantage of that assumption to create an algorithm capable of building a representation of a news article and a distance between such representations for any pair of political news articles. With that knowledge, a global distance between entries based on similarity of content is built. The algorithm is assessed in comparison with the topic modeling algorithm Latent Dirichlet Allocation (LDA). Applications of the system, with their corresponding visualisations, are also presented. [ca]
dc.format.extent: 57 p.
dc.format.mimetype: application/pdf
dc.identifier.uri: https://hdl.handle.net/2445/97486
dc.language.iso: eng [ca]
dc.rights: report: cc-by-nc-sa (c) Álvaro Parafita Martínez, 2016
dc.rights: code: GPL (c) Álvaro Parafita Martínez, 2016
dc.rights.accessRights: info:eu-repo/semantics/openAccess [ca]
dc.rights.uri: http://creativecommons.org/licenses/by-sa/3.0/es
dc.rights.uri: http://www.gnu.org/licenses/gpl-3.0.ca.html
dc.source: Treballs Finals de Grau (TFG) - Enginyeria Informàtica
dc.subject.classification: Tractament del llenguatge natural (Informàtica) [cat]
dc.subject.classification: Intel·ligència artificial [cat]
dc.subject.classification: Programari [cat]
dc.subject.classification: Treballs de fi de grau [cat]
dc.subject.classification: Algorismes computacionals [ca]
dc.subject.classification: Python (Llenguatge de programació) [ca]
dc.subject.other: Natural language processing (Computer science) [eng]
dc.subject.other: Artificial intelligence [eng]
dc.subject.other: Computer software [eng]
dc.subject.other: Bachelor's theses [eng]
dc.subject.other: Computer algorithms [eng]
dc.subject.other: Python (Computer program language) [eng]
dc.title: News similarity with natural language processing [eng]
dc.type: info:eu-repo/semantics/bachelorThesis [ca]

Files

Original bundle

Name: memoria.pdf
Size: 2.3 MB
Format: Adobe Portable Document Format
Description: Report
Name: codi_font.zip
Size: 2.34 MB
Format: ZIP file
Description: Source code