Document type
Bachelor's thesis
Please always use this identifier to cite or link to this document: https://hdl.handle.net/2445/171904
Clústers amb variables mixtes per a la caracterització de clients (Clusters with mixed variables for customer characterisation)
Abstract
[cat] Cluster analysis is a multivariate method whose main objective is to identify groups of objects with similar characteristics within a numerical database. At present, however, this branch of statistics is developing methods for analysing mixed databases, so that both the numerical and the categorical descriptive variables of the objects can be used. These clustering criteria can be classified into two broad groups: hierarchical and non-hierarchical methods.
This work clusters the customer data of an industrial hardware wholesaler in order to sort the customers into several homogeneous groups, using two clustering methods: Ward's method and Partitioning Around Medoids (PAM). Creating these groups requires a similarity coefficient from which the distances between individuals are obtained. The Gower coefficient is therefore used, since it handles numerical and categorical data at the same time. A sensitivity analysis of this measure is also carried out to check its robustness.
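As an illustration of how a Gower-type measure combines both kinds of variable, the following minimal sketch computes pairwise Gower dissimilarities in plain Python. It assumes equal weights and no missing values; the client records, the column meanings and the function name `gower_matrix` are hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence, Union

Value = Union[float, int, str]


def gower_matrix(rows: Sequence[Sequence[Value]],
                 numeric: Sequence[bool]) -> List[List[float]]:
    """Pairwise Gower dissimilarities (equal weights, no missing values)."""
    n_vars = len(numeric)

    # Range of each numeric column, used to scale differences into [0, 1].
    ranges: List[float] = []
    for k in range(n_vars):
        if numeric[k]:
            column = [float(r[k]) for r in rows]
            ranges.append(max(column) - min(column))
        else:
            ranges.append(0.0)  # unused for categorical columns

    def pair(a: Sequence[Value], b: Sequence[Value]) -> float:
        total = 0.0
        for k in range(n_vars):
            if numeric[k]:
                # Range-scaled absolute difference; a constant column adds 0.
                if ranges[k] > 0:
                    total += abs(float(a[k]) - float(b[k])) / ranges[k]
            else:
                # Simple matching: 0 if the categories agree, 1 otherwise.
                total += 0.0 if a[k] == b[k] else 1.0
        return total / n_vars

    return [[pair(a, b) for b in rows] for a in rows]


# Hypothetical clients: (annual spend, number of orders, sector, region).
clients = [
    (1000.0, 10, "construction", "north"),
    (3000.0, 30, "construction", "south"),
    (5000.0, 20, "automotive", "north"),
]
numeric = (True, True, False, False)

D = gower_matrix(clients, numeric)
for row in D:
    print([round(d, 3) for d in row])
```

Each entry lies between 0 and 1, and the corresponding Gower similarity is one minus the dissimilarity. The resulting matrix is the kind of input that a distance-based clustering method can consume.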
[eng] Traditionally, cluster analysis has been applied only to numerical databases, with the main objective of identifying groups of objects with similar characteristics within those databases. It is a growing branch of statistics that is now incorporating methods for clustering the objects of mixed databases. In this project we carry out a cluster analysis of the client information of an industrial hardware wholesaler. Doing so lets us recognize clients that follow specific patterns of behaviour, which allows a more accurate follow-up and lets discount campaigns be segmented according to the clients' interests. To accomplish this, we use both a hierarchical agglomerative method and a non-hierarchical one: Ward's method and Partitioning Around Medoids (PAM), respectively. Creating these groups requires a similarity coefficient from which the distances between individuals are obtained. In this study we use the Gower coefficient, since it handles numerical and categorical data at the same time. The client characterisation process is explained in detail below, starting with the creation of the database and continuing with the description of the various groups obtained. A sensitivity analysis of the Gower distance is also included to validate the robustness of this measure.
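To make the non-hierarchical step more concrete, here is a simplified sketch of medoid-based partitioning over a precomputed distance matrix. It is an assumption-laden illustration, not the procedure used in the thesis: it starts from the first `k` points instead of PAM's BUILD phase and accepts the first improving swap it finds. The toy distances come from six points on a line; in practice the matrix would hold Gower distances. The names `total_cost` and `pam_swap` are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def total_cost(dist: Matrix, medoids: Sequence[int]) -> float:
    """Sum of each point's distance to its nearest medoid."""
    return sum(min(dist[i][m] for m in medoids) for i in range(len(dist)))


def pam_swap(dist: Matrix, k: int) -> Tuple[List[int], List[int]]:
    """Swap-based k-medoids search on a precomputed distance matrix."""
    n = len(dist)
    medoids = list(range(k))  # naive start (assumption), not PAM's BUILD phase
    best = total_cost(dist, medoids)
    improved = True
    while improved:
        improved = False
        # Try replacing each medoid with each non-medoid point.
        for pos, cand in product(range(k), range(n)):
            if cand in medoids:
                continue
            trial = medoids[:pos] + [cand] + medoids[pos + 1:]
            cost = total_cost(dist, trial)
            if cost < best - 1e-12:  # accept only strict improvements
                medoids, best, improved = trial, cost, True
    # Assign every point to the position of its nearest medoid.
    labels = [min(range(k), key=lambda j: dist[i][medoids[j]])
              for i in range(n)]
    return medoids, labels


# Toy data: two well-separated groups of points on a line.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
dist = [[abs(a - b) for b in points] for a in points]

medoids, labels = pam_swap(dist, k=2)
print("medoid indices:", sorted(medoids))
print("cluster labels:", labels)
```

Because the search works only with pairwise distances, it never needs to average categorical values, which is why medoid-based methods pair naturally with a mixed-data measure such as Gower's.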
Description
Final degree projects in Statistics UB-UPC, Faculty of Economics and Business (UB) and Faculty of Mathematics and Statistics (UPC), Academic year: 2019-2020. Supervisors: Ignasi Puig De Dou; Lourdes Rodero De Lamo
Citation
CARBONELL CABUTÍ, Marta. Clústers amb variables mixtes per a la caracterització de clients. [accessed: 11 April 2026]. [Available at: https://hdl.handle.net/2445/171904]