Accuracy comparison between Sparse Autoregressive and XGBoost models for high-dimensional product sales forecasting

dc.contributor.advisor: Vitrià i Marca, Jordi
dc.contributor.author: Ras Jiménez, Blai
dc.date.accessioned: 2022-05-24T09:59:04Z
dc.date.available: 2022-05-24T09:59:04Z
dc.date.issued: 2021-09-02
dc.description: Final projects of the Master's in Foundations of Data Science, Faculty of Mathematics, Universitat de Barcelona. Academic year: 2020-2021. Advisor: Jordi Vitrià i Marca [ca]
dc.description.abstract: [en] Predicting future sales is key for any business’s budgeting and resource allocation. One major concern when trying to build accurate forecasts is the cross-category relationships between products and the effect they might have on each other’s sales. Given today’s data abundance, this issue is even more worrying: traditional statistical models can’t handle high-dimensional datasets with ten or more products. With the use of popular machine learning and data science tools, we developed a framework that enables the building, training and evaluation of two models and their comparison through a detailed set of forecast metrics. The first model is a modified Vector Autoregressive model (VAR) which takes into account product relationships. The second one is an XGBoost model, which is not specialized in cross-category associations but is known for its versatility and performance when working with tabular data. After performing a one-month-ahead sales forecast on a huge dataset of multiple product sets, we find that inter-product connections play a major role in prediction accuracy, since the VAR model performed considerably better than the XGBoost model. [ca]
dc.format.extent: 56 p.
dc.format.mimetype: application/pdf
dc.identifier.uri: https://hdl.handle.net/2445/185976
dc.language.iso: eng [ca]
dc.rights: cc-by-nc-nd (c) Blai Ras Jiménez, 2021
dc.rights.accessRights: info:eu-repo/semantics/openAccess [ca]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/es/ [*]
dc.source: Official Master's - Foundations of Data Science
dc.subject.classification: Dades massives
dc.subject.classification: Aprenentatge automàtic
dc.subject.classification: Gestió de vendes
dc.subject.classification: Treballs de fi de màster
dc.subject.classification: Anàlisi multivariable [ca]
dc.subject.other: Big data
dc.subject.other: Machine learning
dc.subject.other: Sales management
dc.subject.other: Master's theses
dc.subject.other: Multivariate analysis [en]
dc.title: Accuracy comparison between Sparse Autoregressive and XGBoost models for high-dimensional product sales forecasting [ca]
dc.type: info:eu-repo/semantics/masterThesis [ca]

Files

Original bundle

Name: tfm_ras_jimenez_blai.pdf
Size: 1.28 MB
Format: Adobe Portable Document Format
Description: Report