Identifying user clicks based on dependency graph

Identifying user clicks from a large number of measured HTTP requests is a fundamental task in web usage mining, which is important for web administrators and developers. The parallel browsing behavior that multi-tab web browsers now make prevalent turns accurate click identification from massive request volumes into a great challenge. In this paper, we propose a dependency graph model to describe this complicated web browsing behavior. Based on this model, we develop two algorithms that establish the dependency graph for measured requests and identify user clicks by comparing each request's probability of being a primary request with a self-learned threshold. We evaluate our method on a large dataset collected from a real-world mobile core network. The experimental results show that our method identifies user clicks with high accuracy.
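The abstract does not specify the two algorithms, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes that dependencies can be read from the HTTP Referer header, that a request's "primary" score grows with the time gap to its parent, and that the threshold is learned by a simple one-dimensional two-means split. The names `Request`, `build_dependency_graph`, `primary_scores`, `learn_threshold`, `identify_clicks`, and the scale `tau` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Request:
    ts: float                 # arrival time in seconds
    url: str
    referer: Optional[str]    # value of the Referer header, if any


def build_dependency_graph(requests: List[Request]) -> Dict[int, Optional[int]]:
    """Map each request index to its parent index (assumption: the most
    recent earlier request whose URL equals this request's Referer)."""
    last_seen: Dict[str, int] = {}
    parent: Dict[int, Optional[int]] = {}
    for i, r in enumerate(requests):  # requests are assumed sorted by ts
        parent[i] = last_seen.get(r.referer) if r.referer else None
        last_seen[r.url] = i
    return parent


def primary_scores(requests, parent, tau: float = 1.0) -> List[float]:
    """Heuristic stand-in for a probability: 1.0 for root requests,
    otherwise gap / (gap + tau), where gap is the delay after the parent."""
    scores = []
    for i, r in enumerate(requests):
        p = parent[i]
        if p is None:
            scores.append(1.0)
        else:
            gap = r.ts - requests[p].ts
            scores.append(gap / (gap + tau))
    return scores


def learn_threshold(scores: List[float], iters: int = 20) -> float:
    """Self-learned threshold via a 1-D two-means split of the scores."""
    lo, hi = min(scores), max(scores)
    for _ in range(iters):
        mid = (lo + hi) / 2
        low = [s for s in scores if s <= mid]
        high = [s for s in scores if s > mid]
        if not low or not high:
            break
        lo, hi = sum(low) / len(low), sum(high) / len(high)
    return (lo + hi) / 2


def identify_clicks(requests: List[Request]) -> List[str]:
    """Return the URLs whose score reaches the learned threshold."""
    parent = build_dependency_graph(requests)
    scores = primary_scores(requests, parent)
    threshold = learn_threshold(scores)
    return [r.url for r, s in zip(requests, scores) if s >= threshold]
```

In this sketch, embedded objects that follow their page within a fraction of a second receive low scores, while a page requested several seconds after its referrer scores close to 1, so the learned split separates the two groups. A real trace would need more signals than the time gap alone, for example content type or the number of child requests.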
