Clustering Optimization in RFM Analysis Based on k-Means

RFM stands for Recency, Frequency, and Monetary. RFM analysis is a simple but effective method for market segmentation. It examines customer behavior in terms of how recently customers have purchased (recency), how often they purchase (frequency), and how much money they spend (monetary). In this study, RFM analysis is applied to product segmentation: products are ranked by how recently they were sold (R), how frequently they were sold (F), and the total money spent on them (M), using a data mining method. The study proposes a new procedure for RFM analysis in product segmentation that uses the k-Means method together with eight validity indexes to determine the optimal number of clusters, namely the Elbow Method, Silhouette Index, Calinski-Harabasz Index, Davies-Bouldin Index, Ratkowsky Index, Hubert Index, Ball-Hall Index, and Krzanowski-Lai Index. This procedure can improve the objectivity and similarity of data in product segmentation and thereby the accuracy of the stock management process. The evaluation results showed that the optimal number of clusters for the k-Means method applied in the RFM analysis is three (segments), with a variance value of 0.19113.
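The general idea of choosing the number of clusters with validity indexes can be illustrated with a minimal Python sketch. It is not the study's implementation: the product data are synthetic, the three product groups and all function names are hypothetical, only two of the eight indexes are shown (the within-cluster sum of squares used by the Elbow Method, and the Silhouette Index), and a deterministic farthest-first seeding is assumed in place of random initialization so that the run is reproducible.

```python
import math
import random
from statistics import mean

def min_max_scale(rows):
    """Rescale each column to [0, 1] so R, F and M contribute comparably."""
    cols = list(zip(*rows))
    lo, hi = [min(c) for c in cols], [max(c) for c in cols]
    return [tuple((v - l) / (h - l) if h > l else 0.0
                  for v, l, h in zip(row, lo, hi)) for row in rows]

def farthest_first(points, k):
    """Deterministic seeding (an assumption, not the paper's choice): start at
    the first point, then keep adding the point farthest from chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate point assignment and centroid update."""
    centers = farthest_first(points, k)
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old center if a cluster happens to lose all its points.
            updated.append(tuple(mean(col) for col in zip(*members))
                           if members else centers[c])
        if updated == centers:  # assignments are stable: converged
            break
        centers = updated
    return labels, centers

def wcss(points, labels, centers):
    """Within-cluster sum of squares, the quantity plotted in the Elbow Method."""
    return sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))

def silhouette(points, labels):
    """Mean silhouette width (Rousseeuw, 1987); singleton clusters score 0."""
    scores = []
    for i, p in enumerate(points):
        dists = {}
        for j, q in enumerate(points):
            if i != j:
                dists.setdefault(labels[j], []).append(math.dist(p, q))
        own = dists.pop(labels[i], None)
        if not own or not dists:
            scores.append(0.0)
            continue
        a = mean(own)                             # cohesion within own cluster
        b = min(mean(d) for d in dists.values())  # nearest other cluster
        scores.append((b - a) / max(a, b))
    return mean(scores)

# Synthetic RFM rows for three hypothetical product groups:
# (days since last sale, number of sales, total revenue).
rng = random.Random(42)
group_means = [(5, 50, 5000), (45, 5, 4500), (90, 5, 500)]
raw = [(r + rng.uniform(-4, 4), f + rng.uniform(-3, 3), m + rng.uniform(-300, 300))
       for r, f, m in group_means for _ in range(20)]
points = min_max_scale(raw)

results = {}
for k in range(2, 7):
    labels, centers = kmeans(points, k)
    results[k] = (wcss(points, labels, centers), silhouette(points, labels), labels)
    print(f"k={k}  WCSS={results[k][0]:.3f}  silhouette={results[k][1]:.3f}")

best_k = max(results, key=lambda k: results[k][1])
print("k with the highest silhouette:", best_k)
```

On real sales data the indexes may disagree with each other, which is one reason a procedure might consult several of them before settling on a number of clusters.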
