Birds of a Feather Surf Together: Using Clustering Methods to Improve Navigation Prediction from Internet Log Files

Many systems attempt to forecast user navigation on the Internet from past behavior, preferences and environmental factors. Most of these models overlook the possibility that a single user may have several distinct sets of preferences. For example, the same person may search for information differently at night (when pursuing hobbies and interests) than during the day (when at work). Users may therefore well have different sets of preferences at different times of the day and behave differently in accordance with those preferences. In this paper, we present clustering methods for creating time-dependent models to predict user navigation patterns; these methods allow us to segment log files into appropriate groups of navigation behavior. The benefits of these methods over more established methods are highlighted. An empirical analysis of a sample of usage logs for Wireless Application Protocol (WAP) browsing provides support for the technique.
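
The idea of segmenting a log by time of day can be illustrated with a minimal sketch. It is not the clustering method of the paper, which the abstract does not specify; it assumes a hypothetical log of (hour, page) pairs and substitutes a simple greedy "leader" clustering over per-hour page distributions. The names hourly_profiles, distance and cluster_hours, and the threshold value, are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def hourly_profiles(log):
    """Map each hour of the day to a normalized distribution over pages.

    `log` is an iterable of (hour, page) pairs; this format is an assumption.
    """
    counts = defaultdict(Counter)
    for hour, page in log:
        counts[hour][page] += 1
    profiles = {}
    for hour, page_counts in counts.items():
        total = sum(page_counts.values())
        profiles[hour] = {page: n / total for page, n in page_counts.items()}
    return profiles


def distance(p, q):
    """L1 distance between two page distributions (0 = identical, 2 = disjoint)."""
    pages = set(p) | set(q)
    return sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in pages)


def cluster_hours(profiles, threshold):
    """Greedy leader clustering (an illustrative stand-in, not the paper's method).

    Hours are visited in ascending order. Each hour joins the first cluster
    whose seed profile lies within `threshold`; otherwise it seeds a new one.
    """
    clusters = []  # list of (seed_profile, member_hours)
    for hour in sorted(profiles):
        for seed, members in clusters:
            if distance(profiles[hour], seed) <= threshold:
                members.append(hour)
                break
        else:
            clusters.append((profiles[hour], [hour]))
    return [members for _, members in clusters]


if __name__ == "__main__":
    # Hypothetical sample log: daytime hours favor mail/news, evenings favor games.
    sample_log = [
        (9, "news"), (9, "mail"), (10, "mail"), (10, "news"),
        (21, "games"), (22, "games"), (22, "sports"),
    ]
    print(cluster_hours(hourly_profiles(sample_log), threshold=1.2))
```

Each resulting group of hours could then be given its own navigation model, so that a prediction made at a given hour draws only on behavior observed in similar hours.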
