A Comparison of Scoring Metrics for Predicting the Next Navigation Step with Markov Model-Based Systems

The problem of predicting the next request during a user's navigation session has been studied extensively. In this context, higher-order Markov models have been widely used to model navigation sessions and to predict the next navigation step, while prediction accuracy has mainly been evaluated with the hit and miss score. We claim that this score, although useful, is not sufficient for evaluating next-link prediction models when the aim is to find a sufficient order for the model, to choose the size of a recommendation set, and to assess the impact of unexpected events on prediction accuracy. Herein, we use a variable-length Markov model to compare the usefulness of three alternatives to the hit and miss score: the Mean Absolute Error, the Ignorance Score, and the Brier Score. We present an extensive evaluation of the methods on real data sets and a comprehensive comparison of the scoring methods.
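To make the contrast between the scores concrete, the following is a minimal Python sketch that evaluates a single next-step forecast with the hit and miss score, the Ignorance Score, and the Brier Score. It assumes the common textbook definitions (Ignorance as the negative base-2 logarithm of the probability given to the observed page, Brier as the summed squared difference between the forecast and the observed outcome); the exact formulations used in the paper are not given in this abstract and may differ, and the Mean Absolute Error is omitted for that reason. The function names, the `k` and `floor` parameters, and the example forecast are hypothetical.

```python
import math

def hit_and_miss(forecast, actual, k=1):
    """Return 1 if the observed page is among the k most probable pages, else 0."""
    top_k = sorted(forecast, key=forecast.get, reverse=True)[:k]
    return 1 if actual in top_k else 0

def ignorance(forecast, actual, floor=1e-12):
    """Negative log2 of the probability assigned to the observed page.

    The floor (an assumption of this sketch) avoids log(0) when the
    observed page was given no probability at all.
    """
    p = max(forecast.get(actual, 0.0), floor)
    return -math.log2(p)

def brier(forecast, actual):
    """Sum of squared differences between forecast and observed outcome."""
    pages = set(forecast) | {actual}
    return sum(
        (forecast.get(page, 0.0) - (1.0 if page == actual else 0.0)) ** 2
        for page in pages
    )

# Hypothetical forecast over three candidate next pages.
forecast = {"A": 0.5, "B": 0.25, "C": 0.25}

for observed in ("A", "B"):
    print(
        observed,
        hit_and_miss(forecast, observed),
        ignorance(forecast, observed),
        brier(forecast, observed),
    )
```

In this sketch the hit and miss score only records whether the observed page was the top prediction, whereas the Ignorance and Brier scores also change with how much probability the observed page received, which is the kind of information a binary score discards.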
