Please use this identifier to cite or link to this item: http://hdl.handle.net/2445/112223
Title: User behaviour analysis on Reddit
Author: Chen, Huang
Director: Garrido Ostermann, Lluís
Keywords: Dades massives
Comunitats virtuals
Programari
Mitjans socials
Conducta (Psicologia)
Tesis
Big data
Online social networks
Computer software
Human behavior
Social media
Theses
Issue Date: 30-Jun-2016
Abstract: Nowadays, Big Data and Data Science seem to be the most useful technologies for data processing, used in a variety of areas such as business, sports, social media, and investigations. But what is Big Data? What is Data Science? And what is the difference between them? Big Data refers to data sets so large or complex that traditional data processing applications cannot manage them in a reasonable time [1]. Hence, we can summarize Big Data as a concept covering the storage of large amounts of data and the procedures used to find recurring patterns within that data. Data Science is an interdisciplinary field concerned with the processes and systems that extract knowledge or insights from data in various forms, either structured or unstructured. It is a continuation of data analysis fields such as statistics, data mining, and predictive analytics, and is similar to Knowledge Discovery in Databases (KDD). Comparing the two: Data Science seeks to create models that capture the underlying patterns of complex systems and to codify those models into working applications, whereas Big Data seeks to collect and manage large amounts of varied data to serve large-scale web applications and vast sensor networks. Nevertheless, the terms are often used interchangeably despite having fundamentally different roles to play in bringing the potential of data to the doorstep of an organization [7]. Social media is now the fastest way to spread information, and it has grown quickly in recent years, so it is interesting to know how the database behind social media is processed. That is what led me to this topic and why I chose it for my Degree Final Project (TFG). Now, how are Big Data and Data Science applied in social media? Social media is the collection of online communication channels dedicated to community-based input, interaction, content sharing, and collaboration.
In other words, it is a computer-mediated tool that allows people to exchange information in virtual communities. Hence, it must have an enormous and complex database, because that information can be created and stored in files of many formats: pictures, videos, text, and so on. Within the picture category alone, there are various file types such as PNG, JPEG, and GIF. Therefore, we need to apply Big Data and Data Science concepts to manage and process this data intelligently. In this project, however, we focus on just one particular piece of social media: Reddit.
Note: Final Degree Projects in Computer Engineering, Faculty of Mathematics, Universitat de Barcelona, Year: 2016, Director: Lluís Garrido Ostermann
URI: http://hdl.handle.net/2445/112223
Appears in Collections: Treballs Finals de Grau (TFG) - Enginyeria Informàtica
Programari - Treballs de l'alumnat

Files in This Item:
File | Description | Size | Format
memoria.pdf | Report | 1.89 MB | Adobe PDF
codi_font.zip | Source code | 238.28 kB | zip


This item is licensed under a Creative Commons License.