Apply Machine Learning in the Company to Predict the Quality of Sales Leads

Solé Casaramona, Jordi

Please use this identifier to cite or link to this item: http://hdl.handle.net/2445/180483

Title:	Apply Machine Learning in the Company to Predict the Quality of Sales Leads
Author:	Solé Casaramona, Jordi
Director/Tutor:	Llorens Latorre, Xavier
Keywords:	Machine learning; Business forecasting; Master's theses; Sales management; Learning classifier systems
Issue Date:	31-Aug-2020
Abstract:	[en] Many organizations are still driven by intuition and experience-based decision making. Decisions of this kind are exposed to human bias, to the loss of experienced workers, and to a reluctance to adopt more sophisticated information systems. With the arrival of the data era, companies have more information at their disposal than ever before, but few know how to use this resource to its full potential. In this work, we develop a data science pipeline to predict the quality of sales leads for the EMEA 3D sales department at HP, a project that aims to support the transition to a data-driven decision-making organization. The pipeline focuses on two tasks. The first is a web scraping tool that obtains information that was either not previously available in the company database or very time-consuming to acquire because of the size of the database, which holds more than 40,000 leads. The second is the training of a machine learning algorithm that predicts a quality score for every lead, together with an explanation of the main features behind each decision. The result had a strong impact on the business: the knowledge is retained within the company inside the machine learning model, and the explanation of each decision is building confidence in the model. Furthermore, the sales team used the score to make more data-driven decisions and to save time by prioritizing the best-quality leads. The trained Extreme Gradient Boosting model reached a total accuracy of 0.94282 on the test set, a 13.45% improvement over the baseline model. Lastly, all these tasks were assembled into a pipeline and deployed to a server inside HP, where the process runs automatically every day with minimal human intervention. The pipeline gave very positive results for the organization, and further development is under way to enhance the results.
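The abstract reports accuracy relative to a baseline model but does not say which baseline was used or whether the improvement is absolute or relative. As a minimal sketch of how such a comparison is commonly computed, the snippet below assumes a majority-class baseline and uses a handful of made-up labels; the data, the names `y_train`, `y_test`, `y_model`, and the choice of baseline are all hypothetical and are not taken from the thesis.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline(y_train, n):
    """Predict the most frequent training label for each of n test leads."""
    most_common = Counter(y_train).most_common(1)[0][0]
    return [most_common] * n


# Toy labels (hypothetical): 1 = good-quality lead, 0 = poor-quality lead.
y_train = [0, 0, 0, 1, 0, 1, 0, 0]
y_test = [0, 1, 0, 0, 1, 0, 0, 1, 0, 0]

# Hypothetical predictions standing in for a trained classifier's output.
y_model = [0, 1, 0, 0, 1, 0, 0, 0, 0, 0]

base_acc = accuracy(y_test, majority_baseline(y_train, len(y_test)))
model_acc = accuracy(y_test, y_model)

# Two common ways to state an improvement over a baseline.
absolute_gain = model_acc - base_acc      # difference in accuracy
relative_gain = absolute_gain / base_acc  # difference as a share of baseline

print(f"baseline accuracy: {base_acc:.3f}")
print(f"model accuracy:    {model_acc:.3f}")
print(f"absolute gain:     {absolute_gain:.3f}")
print(f"relative gain:     {relative_gain:.3f}")
```

The absolute and relative figures differ noticeably even on this toy example, which is why a reported improvement is easier to interpret when the baseline and the type of gain are stated alongside it.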
Note:	Final projects of the Master's in Fundamentals of Data Science, Faculty of Mathematics, Universitat de Barcelona, Year: 2020, Tutors: Xavier Llorens Latorre and Mariano Yagüez Insa
URI:	http://hdl.handle.net/2445/180483
Appears in Collections:	Official Master's - Fundamentals of Data Science

Files in This Item:

File	Description	Size	Format
tfm_sole_casaramona_jordi.pdf	Thesis report	618.48 kB	Adobe PDF

This item is licensed under a Creative Commons License