Analytical method of web user behavior using Hidden Markov Model

We propose a new analytical method that classifies web user behavior according to users' latent states, such as intention, interest, or motivation. First, we fit a Hidden Markov Model to the clickstream data of many users, using enough hidden states to build a state transition network. Because the various hidden states represent different latent states of users, movement on the state transition network can represent user behavior. Second, we divide each user's clickstream data into sessions, which we cluster using their movement on the network as feature values. The resulting cluster labels represent the latent states of users during their stay in the web service. In this paper, we applied our method to data from Girl Friend BETA, a social network game, that is, an online game provided mainly on social networking services. We observed hidden states that represent various latent states of users, such as enthusiasm for the main content of the service, playing basic content, and daily routines commonly observed when visiting the service (e.g., receiving login bonuses). We also classified the sessions by the latent states of users, yielding groups such as light-user sessions, low-motivation sessions, and sessions in which users seem addicted to the main content.
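The second step, turning movement on the state transition network into per-session feature values, can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: hidden states have already been decoded into integer labels (for example by Viterbi decoding), sessions are split at an idle gap of 30 minutes, and "network movement" is summarized as normalized transition counts. The function names `split_sessions` and `transition_features` are hypothetical.

```python
from collections import Counter
from itertools import product


def split_sessions(events, gap=1800):
    """Split (timestamp_seconds, hidden_state) events into sessions.

    Assumption: a new session starts when the idle time between two
    consecutive events exceeds `gap` seconds (30 minutes by default).
    """
    sessions, current, last_t = [], [], None
    for t, state in events:
        if last_t is not None and t - last_t > gap:
            sessions.append(current)
            current = []
        current.append(state)
        last_t = t
    if current:
        sessions.append(current)
    return sessions


def transition_features(states, n_states):
    """Return normalized state-to-state transition counts, flattened row-major.

    Entry i * n_states + j is the share of transitions from state i to state j.
    A session with a single event has no transitions and yields all zeros.
    """
    counts = Counter(zip(states, states[1:]))
    total = sum(counts.values()) or 1
    return [counts[(i, j)] / total
            for i, j in product(range(n_states), repeat=2)]


# Hypothetical decoded hidden states for one user: (timestamp, state) pairs.
events = [(0, 0), (60, 1), (120, 1), (5000, 2), (5060, 2), (5120, 0)]
sessions = split_sessions(events)
features = [transition_features(s, n_states=3) for s in sessions]
```

Each feature vector could then be passed to a clustering algorithm of choice to obtain session labels; which algorithm and which exact features the authors used is not specified in the abstract.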
