Auto Insurance Business Analytics Approach for Customer Segmentation Using Multiple Mixed-Type Data Clustering Algorithms

Customer segmentation is critical for auto insurance companies seeking a competitive advantage, because it lets them mine useful customer-related information. Although some efforts have been made to apply customer segmentation to auto insurance decision making, their results tend to depend on the characteristics of the single algorithm used and are not validated against multiple algorithms. To this end, we propose an auto insurance business analytics approach that segments customers using three mixed-type data clustering algorithms: k-prototypes, improved k-prototypes, and similarity-based agglomerative clustering. The segmentation results of these algorithms complement and reinforce each other and reveal as much information as possible to support decision making. To demonstrate its practical value, the proposed approach is used to extract seven rules for an auto insurance company; these rules may help the company make customer-related decisions and develop insurance products.
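
To make the idea of mixed-type clustering concrete, the following is a minimal sketch of the basic k-prototypes scheme, the first of the three algorithms named above. It assumes the usual formulation: squared Euclidean distance on numeric attributes plus a weight gamma times the number of categorical mismatches, with prototypes updated by mean and mode. The toy customer records, the attribute meanings, the gamma value, and the `init_indices` parameter are all hypothetical and chosen for illustration; they are not data or settings from the study.

```python
from collections import Counter

def mixed_distance(num_a, cat_a, num_b, cat_b, gamma):
    """Squared Euclidean distance on numeric parts plus gamma * categorical mismatches."""
    numeric_part = sum((x - y) ** 2 for x, y in zip(num_a, num_b))
    mismatches = sum(a != b for a, b in zip(cat_a, cat_b))
    return numeric_part + gamma * mismatches

def k_prototypes(numeric, categorical, init_indices, gamma=1.0, max_iter=20):
    """Cluster mixed-type records; init_indices selects the starting prototypes."""
    protos = [(list(numeric[i]), list(categorical[i])) for i in init_indices]
    n_num, n_cat = len(numeric[0]), len(categorical[0])
    labels = None
    for _ in range(max_iter):
        # Assignment step: each record joins its nearest prototype.
        new_labels = [
            min(range(len(protos)),
                key=lambda c: mixed_distance(num, cat, protos[c][0], protos[c][1], gamma))
            for num, cat in zip(numeric, categorical)
        ]
        if new_labels == labels:
            break  # assignments are stable, so the procedure has converged
        labels = new_labels
        # Update step: mean for numeric attributes, mode for categorical ones.
        for c in range(len(protos)):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                continue  # keep the old prototype if a cluster is empty
            means = [sum(numeric[i][j] for i in members) / len(members)
                     for j in range(n_num)]
            modes = [Counter(categorical[i][j] for i in members).most_common(1)[0][0]
                     for j in range(n_cat)]
            protos[c] = (means, modes)
    return labels, protos

# Hypothetical customers: (scaled age, scaled premium) and (vehicle type, region).
numeric = [[0.10, 0.20], [0.20, 0.10], [0.15, 0.25],
           [0.90, 0.80], [0.80, 0.90], [0.85, 0.95]]
categorical = [["sedan", "urban"], ["sedan", "urban"], ["suv", "urban"],
               ["truck", "rural"], ["truck", "rural"], ["suv", "rural"]]

labels, protos = k_prototypes(numeric, categorical, init_indices=[0, 3])
print(labels)
print(protos)
```

In this sketch, gamma controls how strongly categorical mismatches count relative to numeric distance, so numeric attributes would normally be scaled to comparable ranges first. Running several algorithms on the same records and comparing their label assignments is one simple way to check whether segments agree across methods.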
