Accuracy comparison between Sparse Autoregressive and XGBoost models for high-dimensional product sales forecasting
| dc.contributor.advisor | Vitrià i Marca, Jordi | |
| dc.contributor.author | Ras Jiménez, Blai | |
| dc.date.accessioned | 2022-05-24T09:59:04Z | |
| dc.date.available | 2022-05-24T09:59:04Z | |
| dc.date.issued | 2021-09-02 | |
| dc.description | Master's final projects, Master in Fundamentals of Data Science, Faculty of Mathematics, Universitat de Barcelona. Academic year: 2020-2021. Advisor: Jordi Vitrià i Marca | ca |
| dc.description.abstract | [en] Predicting future sales is key to any business’s budgeting and resource allocation. One major concern when building accurate forecasts is the cross-category relationships between products and the effect they may have on each other’s sales. Given today’s data abundance, this issue is even more pressing: traditional statistical models cannot handle high-dimensional datasets with ten or more products. Using popular machine learning and data science tools, we developed a framework that enables the building, training and evaluation of two models and their comparison through a detailed set of forecast metrics. The first model is a modified Vector Autoregressive (VAR) model that takes product relationships into account. The second is an XGBoost model, which is not specialized in cross-category associations but is known for its versatility and performance on tabular data. After performing one-month-ahead sales forecasting on a large dataset of multiple product sets, we find that inter-product connections play a major role in prediction accuracy, since the VAR model performed considerably better than XGBoost. | ca |
| dc.format.extent | 56 p. | |
| dc.format.mimetype | application/pdf | |
| dc.identifier.uri | https://hdl.handle.net/2445/185976 | |
| dc.language.iso | eng | ca |
| dc.rights | cc-by-nc-nd (c) Blai Ras Jiménez, 2021 | |
| dc.rights.accessRights | info:eu-repo/semantics/openAccess | ca |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/es/ | * |
| dc.source | Màster Oficial - Fonaments de la Ciència de Dades | |
| dc.subject.classification | Dades massives | |
| dc.subject.classification | Aprenentatge automàtic | |
| dc.subject.classification | Gestió de vendes | |
| dc.subject.classification | Treballs de fi de màster | |
| dc.subject.classification | Anàlisi multivariable | ca |
| dc.subject.other | Big data | |
| dc.subject.other | Machine learning | |
| dc.subject.other | Sales management | |
| dc.subject.other | Master's theses | |
| dc.subject.other | Multivariate analysis | en |
| dc.title | Accuracy comparison between Sparse Autoregressive and XGBoost models for high-dimensional product sales forecasting | ca |
| dc.type | info:eu-repo/semantics/masterThesis | ca |
Files
Original bundle
- Name: tfm_ras_jimenez_blai.pdf
- Size: 1.28 MB
- Format: Adobe Portable Document Format
- Description: Thesis report