Clustering Model Based on Web Behavior

Web log mining is an emerging area of data mining. It yields valuable information by discovering trends and regularities in web users' access patterns. Clustering users by access pattern is an important research topic in web usage mining, and knowledge obtained from web user clusters has been applied in several areas of web mining. This paper presents an algorithm for measuring similarity between web users and automatically segmenting them based on their past access patterns. The compatibility measures are based on content extracted from users' browser data. The paper also provides a locality-based clustering method for users who are not yet acquainted with their most compatible friends.

Keywords: Cluster, Model, Web, Activity, Report, Interest, Location, Compatibility, Matrix, Data Mining, Fuzzy Clustering, GeoIP, LDA, Sessions, History, Snippet, Links, IP.
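As a rough illustration of what a pairwise compatibility matrix over browsing histories could look like, the following minimal sketch scores each pair of users by the Jaccard overlap of the pages they visited. The abstract does not specify the similarity measure, so Jaccard is only a stand-in here, and the user names, URLs, and function names are hypothetical.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def compatibility_matrix(histories: dict) -> dict:
    """Map each unordered pair of users to a similarity score.

    Assumption: a user's history is a set of visited page identifiers,
    and Jaccard overlap stands in for the paper's compatibility measure.
    """
    return {
        (u, v): jaccard(histories[u], histories[v])
        for u, v in combinations(sorted(histories), 2)
    }


# Hypothetical browsing histories, for illustration only.
histories = {
    "alice": {"news.example/sports", "shop.example/shoes", "blog.example/python"},
    "bob": {"news.example/sports", "blog.example/python"},
    "carol": {"travel.example/paris"},
}

matrix = compatibility_matrix(histories)
for pair, score in matrix.items():
    print(pair, round(score, 3))
```

Pairs with high scores would be candidates for the same cluster. A content-based measure such as the one the abstract describes would replace the raw page sets with features extracted from page content.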
