Probabilistic user behavior models

We present a mixture-model-based approach for learning individualized behavior models for Web users. We investigate the use of maximum entropy and Markov mixture models for generating probabilistic behavior models. We first build a global behavior model for the entire population and then personalize this global model for existing users by assigning each user individual component weights for the mixture model. We then use these individual weights to group users into behavior model clusters. We show that the clusters generated in this manner are interpretable and represent dominant behavior patterns. We conduct offline experiments on around two months' worth of data from CiteSeer, an online digital library for computer science research papers that currently stores more than 470,000 documents. We show that both the maximum entropy and the Markov based personal user behavior models are strong predictive models. We also show that the maximum entropy based mixture model outperforms Markov mixture models in recognizing complex user behavior patterns.
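
The personalization and clustering steps described above can be illustrated with a minimal sketch. It assumes the global components are first-order Markov chains whose parameters are already fitted and held fixed, and that each user's weights are re-estimated with EM over that user's sessions only. The abstract does not specify the estimation procedure, so the function names (`session_likelihood`, `personalize_weights`, `assign_cluster`), the two-state toy data, and the rule of assigning a user to the component with the largest weight are all assumptions made for illustration, not the authors' method.

```python
from typing import Dict, List, Sequence

# Hypothetical representation of one Markov component:
# {"init": initial state probabilities, "trans": transition matrix}.
Component = Dict[str, list]


def session_likelihood(session: Sequence[int], comp: Component) -> float:
    """Likelihood of one session under a first-order Markov component."""
    p = comp["init"][session[0]]
    for prev, cur in zip(session, session[1:]):
        p *= comp["trans"][prev][cur]
    return p


def personalize_weights(
    sessions: List[Sequence[int]],
    components: List[Component],
    global_weights: Sequence[float],
    n_iter: int = 50,
) -> List[float]:
    """EM over the mixture weights only; component parameters stay fixed."""
    # Component likelihoods do not change, so compute them once.
    lik = [[session_likelihood(s, c) for c in components] for s in sessions]
    w = list(global_weights)  # start from the population-level weights
    for _ in range(n_iter):
        totals = [0.0] * len(w)
        for row in lik:
            joint = [wk * lk for wk, lk in zip(w, row)]
            z = sum(joint)
            if z == 0.0:
                continue  # session impossible under every component
            for k, j in enumerate(joint):
                totals[k] += j / z  # E-step: responsibility of component k
        s = sum(totals)
        if s == 0.0:
            break
        w = [t / s for t in totals]  # M-step: normalized responsibilities
    return w


def assign_cluster(weights: Sequence[float]) -> int:
    """Assumed clustering rule: index of the dominant component."""
    return max(range(len(weights)), key=lambda k: weights[k])


# Toy example with two page types (0 and 1), purely illustrative.
components = [
    {"init": [0.9, 0.1], "trans": [[0.9, 0.1], [0.5, 0.5]]},
    {"init": [0.1, 0.9], "trans": [[0.5, 0.5], [0.1, 0.9]]},
]
global_weights = [0.5, 0.5]
user_sessions = [[1, 1, 1], [1, 1, 0, 1], [1, 1]]

weights = personalize_weights(user_sessions, components, global_weights)
cluster = assign_cluster(weights)
print(weights, cluster)
```

In this toy case every session is far more likely under the second component, so the user's weight mass moves almost entirely to it. With few sessions per user, unsmoothed weights like these can become extreme; interpolating them with the global weights is one possible remedy, again an assumption rather than something the abstract states.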
