Please use this identifier to cite or link to this item: http://hdl.handle.net/2445/185976
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Vitrià i Marca, Jordi | -
dc.contributor.author | Ras Jiménez, Blai | -
dc.date.accessioned | 2022-05-24T09:59:04Z | -
dc.date.available | 2022-05-24T09:59:04Z | -
dc.date.issued | 2021-09-02 | -
dc.identifier.uri | http://hdl.handle.net/2445/185976 | -
dc.description | Final projects of the Màster de Fonaments de Ciència de Dades, Facultat de Matemàtiques, Universitat de Barcelona. Academic year: 2020-2021. Advisor: Jordi Vitrià i Marca | ca
dc.description.abstract | [en] Predicting future sales is key to any business's budgeting and resource allocation. One major concern when building accurate forecasts is the cross-category relationships between products and the effect they may have on each other's sales. Given today's data abundance, this issue is even more pressing: traditional statistical models cannot handle high-dimensional datasets with ten or more products. Using popular machine learning and data science tools, we developed a framework that enables the building, training and evaluation of two models and their comparison through a detailed set of forecast metrics. The first model is a modified Vector Autoregressive (VAR) model that takes product relationships into account. The second is an XGBoost model, which is not specialized in cross-category associations but is known for its versatility and performance on tabular data. After performing one-month-ahead sales forecasting on a large dataset of multiple product sets, we find that inter-product connections play a major role in prediction accuracy, since the VAR model performed considerably better than XGBoost. | ca
dc.format.extent | 56 p. | -
dc.format.mimetype | application/pdf | -
dc.language.iso | eng | ca
dc.rights | cc-by-nc-nd (c) Blai Ras Jiménez, 2021 | -
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/es/ | *
dc.source | Màster Oficial - Fonaments de la Ciència de Dades | -
dc.subject.classification | Dades massives | -
dc.subject.classification | Aprenentatge automàtic | -
dc.subject.classification | Gestió de vendes | -
dc.subject.classification | Treballs de fi de màster | -
dc.subject.classification | Anàlisi multivariable | ca
dc.subject.other | Big data | -
dc.subject.other | Machine learning | -
dc.subject.other | Sales management | -
dc.subject.other | Master's theses | -
dc.subject.other | Multivariate analysis | en
dc.title | Accuracy comparison between Sparse Autoregressive and XGBoost models for high-dimensional product sales forecasting | ca
dc.type | info:eu-repo/semantics/masterThesis | ca
dc.rights.accessRights | info:eu-repo/semantics/openAccess | ca
Appears in Collections: Màster Oficial - Fonaments de la Ciència de Dades

Files in This Item:
File | Description | Size | Format
tfm_ras_jimenez_blai.pdf | Report | 1.31 MB | Adobe PDF


This item is licensed under a Creative Commons License.