A mixed attributes oriented dynamic SOM fuzzy cluster algorithm for mobile user classification

Abstract In mobile user behavior analysis, clustering algorithms are used to classify users. Mobile user datasets are usually mixed, containing both numerical and categorical data. Traditional algorithms such as K-Means give inaccurate results on such data and are highly sensitive to initialization. The K-Prototypes algorithm can process mixed data, but the weight coefficient of its categorical attributes is difficult to determine. To address these problems, this paper proposes a mixed attributes oriented dynamic SOM fuzzy cluster algorithm (D-SOMFCM-OMA) for mobile user classification. First, the algorithm performs a primary clustering with a Self-Organizing feature Map (SOM) to obtain the initial clustering parameters; as a preprocessing step, this reduces the effects of inappropriate initialization. Then, in a second clustering stage, the algorithm applies an improved dynamic fuzzy K-Prototypes method to classify users dynamically. It calculates the weight of each kind of attribute according to the proportion of that attribute and uses a fine-tuned coefficient to adjust the attribute weight. To improve the clustering effect, the algorithm uses the Jaccard distance to calculate distances between mixed attribute variables. Further, this paper defines a user mean membership threshold, an indicator that determines whether new groups need to be added. Finally, comparison experiments on UCI standard datasets show the advantage of the improved fuzzy clustering algorithm oriented to mixed attributes (IFCM-OMA), and a further experiment on a UCI dataset verifies the validity of D-SOMFCM-OMA.
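As an illustration of the weighting idea described in the abstract, the following is a minimal sketch rather than the paper's actual formulation. It assumes that numerical attributes are pre-scaled to [0, 1] and compared by squared Euclidean distance, that the Jaccard distance is applied only to the categorical part, and that each part is weighted by its share of the attributes. The function names and the `fine_tune` parameter are hypothetical.

```python
from math import fsum


def jaccard_distance(a, b):
    """Jaccard distance between two categorical records.

    Each record is treated as a set of (attribute index, value) pairs,
    so equal values in different attributes do not count as matches.
    """
    sa = set(enumerate(a))
    sb = set(enumerate(b))
    union = sa | sb
    if not union:
        return 0.0
    return 1.0 - len(sa & sb) / len(union)


def mixed_distance(x_num, x_cat, c_num, c_cat, fine_tune=1.0):
    """Distance between a record and a cluster prototype on mixed data.

    Assumption (not taken from the paper): each part is weighted by the
    proportion of attributes of that type, and `fine_tune` rescales the
    categorical weight.
    """
    n_num, n_cat = len(x_num), len(x_cat)
    total = n_num + n_cat
    if total == 0:
        return 0.0
    w_num = n_num / total
    w_cat = fine_tune * n_cat / total
    # Squared Euclidean distance on the numerical attributes.
    d_num = fsum((p - q) ** 2 for p, q in zip(x_num, c_num))
    # Jaccard distance on the categorical attributes.
    d_cat = jaccard_distance(x_cat, c_cat)
    return w_num * d_num + w_cat * d_cat


if __name__ == "__main__":
    # Hypothetical record and prototype: two numerical, two categorical attributes.
    print(mixed_distance([0.0, 1.0], ("a", "b"), [0.0, 0.0], ("a", "c")))
```

With equal numbers of numerical and categorical attributes, both parts receive weight 0.5 by default; raising `fine_tune` above 1 makes categorical mismatches count more.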
