Please use this identifier to cite or link to this item:
http://hdl.handle.net/2445/185976
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Vitrià i Marca, Jordi | - |
dc.contributor.author | Ras Jiménez, Blai | - |
dc.date.accessioned | 2022-05-24T09:59:04Z | - |
dc.date.available | 2022-05-24T09:59:04Z | - |
dc.date.issued | 2021-09-02 | - |
dc.identifier.uri | http://hdl.handle.net/2445/185976 | - |
dc.description | Final projects of the Màster de Fonaments de Ciència de Dades, Faculty of Mathematics, Universitat de Barcelona. Academic year: 2020-2021. Advisor: Jordi Vitrià i Marca | ca |
dc.description.abstract | [en] Predicting future sales is key to any business's budgeting and resource allocation. One major concern when building accurate forecasts is the cross-category relationships between products and the effect they may have on each other's sales. Given today's data abundance, this issue is even more pressing: traditional statistical models can't handle high-dimensional datasets with ten or more products. Using popular machine learning and data science tools, we developed a framework that enables the building, training and evaluation of two models and their comparison through a detailed set of forecast metrics. The first model is a modified Vector Autoregressive (VAR) model, which takes product relationships into account. The second is an XGBoost model, which is not specialized in cross-category associations but is known for its versatility and performance on tabular data. After performing one-month-ahead sales forecasting on a huge dataset of multiple product sets, we find that inter-product connections play a major role in prediction accuracy, since the VAR model performed considerably better than the XGBoost model. | ca |
dc.format.extent | 56 p. | - |
dc.format.mimetype | application/pdf | - |
dc.language.iso | eng | ca |
dc.rights | cc-by-nc-nd (c) Blai Ras Jiménez, 2021 | - |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/es/ | * |
dc.source | Màster Oficial - Fonaments de la Ciència de Dades | - |
dc.subject.classification | Dades massives | - |
dc.subject.classification | Aprenentatge automàtic | - |
dc.subject.classification | Gestió de vendes | - |
dc.subject.classification | Treballs de fi de màster | - |
dc.subject.classification | Anàlisi multivariable | ca |
dc.subject.other | Big data | - |
dc.subject.other | Machine learning | - |
dc.subject.other | Sales management | - |
dc.subject.other | Master's theses | - |
dc.subject.other | Multivariate analysis | en |
dc.title | Accuracy comparison between Sparse Autoregressive and XGBoost models for high-dimensional product sales forecasting | ca |
dc.type | info:eu-repo/semantics/masterThesis | ca |
dc.rights.accessRights | info:eu-repo/semantics/openAccess | ca |
Appears in Collections: | Màster Oficial - Fonaments de la Ciència de Dades |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
tfm_ras_jimenez_blai.pdf | Thesis report | 1.31 MB | Adobe PDF |
This item is licensed under a Creative Commons License