Web Usage Mining

In recent years, e-businesses have profited from advances in the analysis of web customer behaviour. For decades, experts in the web intelligence community have debated how to present the content or structure of a web site so that it captures the web user's attention. A solution to this problem could help boost sales on an e-commerce site. Web Usage Mining (WUM) is the extraction of web user browsing behaviour by applying data mining techniques to web data. Accordingly, several data analysis models have been used to characterize web user browsing behaviour. Nevertheless, notable techniques have recently been developed to improve on the success rates of conventional methods for behavioural pattern extraction. This chapter presents different approaches to WUM, considering their main insights, results, and applications to web behaviour systems.
