Clustering of Web Users Based on Matrix of Influence Degree

Clustering web users is an important research topic in web mining. Information about web user clusters has been widely used in many applications, such as website structure optimization, website reconstruction, and the distribution of advertising. In this paper, we convert web log data into a sparse matrix and propose a novel approach that calculates the influence degree of each web page for all web users. From the generated sparse matrix we build a Matrix of Influence Degree (MID), and web users can then be clustered directly from the MID. Experimental results show that the proposed approach can serve as a basis for clustering web users in web log data.
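The pipeline described above (web log, sparse matrix, MID, clusters) can be illustrated with a minimal sketch. The abstract does not give the influence-degree formula or the clustering procedure, so everything below is an assumption made for illustration: the influence degree is taken to be a page's share of a user's visits, users are grouped by their highest-influence page, and the function names and log records are hypothetical.

```python
from collections import defaultdict


def build_sparse_matrix(log_records):
    """Count visits per (user, page); only nonzero entries are stored."""
    counts = defaultdict(lambda: defaultdict(int))
    for user, page in log_records:
        counts[user][page] += 1
    return counts


def influence_degree_matrix(counts):
    """ASSUMPTION: influence degree = a page's share of a user's visits.

    This is a stand-in, not the formula proposed in the paper.
    """
    mid = {}
    for user, pages in counts.items():
        total = sum(pages.values())
        mid[user] = {page: n / total for page, n in pages.items()}
    return mid


def cluster_by_top_page(mid):
    """ASSUMPTION: group users by their highest-influence page.

    Ties are broken by taking the alphabetically first page.
    """
    clusters = defaultdict(list)
    for user, row in mid.items():
        top = max(sorted(row), key=row.get)
        clusters[top].append(user)
    return {page: sorted(users) for page, users in clusters.items()}


# Hypothetical (user, page) records extracted from a web log.
log = [("u1", "/home"), ("u1", "/home"), ("u1", "/news"),
       ("u2", "/home"), ("u3", "/news"), ("u3", "/news")]

mid = influence_degree_matrix(build_sparse_matrix(log))
clusters = cluster_by_top_page(mid)
```

Storing only the visited pages per user keeps the matrix sparse. Any standard clustering method could replace the last step once the MID rows are available.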
