Exploring Web Access Logs with Correspondence Analysis

As Internet users interact with a website, they reveal information about themselves and about how they respond to the site's content: where they come from, which links they click, and where they spend most of their time. All of this information is stored in a log file or a database. In this paper we demonstrate how a data analysis method, Correspondence Analysis, can be applied to web log statistics to examine user behavior and preferences. Specifically, we observed the log statistics of a university department website on a monthly basis, plotting each monthly data set on the same factorial plane. We believe that this process can produce valuable results both for web content designers and for institutions with an Internet presence.
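To illustrate the kind of computation involved, the following is a minimal sketch of simple correspondence analysis on a small contingency table. It is not the software or the data used in the study: the section names, visitor origins, and counts are hypothetical, and the function names are chosen here for illustration only. The sketch builds the matrix of standardized residuals and extracts the first two factorial axes by power iteration, assuming every row and column of the table has a nonzero total.

```python
import math


def _dominant_eigenpair(A, iters=1000):
    """Power iteration on a symmetric positive semi-definite matrix A.

    Returns an approximation of the largest eigenvalue and a unit eigenvector.
    """
    size = len(A)
    v = [float(j + 1) for j in range(size)]  # arbitrary deterministic start
    for _ in range(iters):
        w = [sum(A[a][b] * v[b] for b in range(size)) for a in range(size)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # A is numerically zero: no association left
            return 0.0, [0.0] * size
        v = [x / norm for x in w]
    Av = [sum(A[a][b] * v[b] for b in range(size)) for a in range(size)]
    return sum(v[a] * Av[a] for a in range(size)), v


def correspondence_analysis(table, n_axes=2):
    """Return (inertias, row_coords, col_coords) for the first n_axes axes.

    Coordinates are principal coordinates; the sign of each axis is arbitrary.
    Assumes all row and column totals are strictly positive.
    """
    n = float(sum(sum(row) for row in table))
    P = [[x / n for x in row] for row in table]   # correspondence matrix
    r = [sum(row) for row in P]                   # row masses
    c = [sum(col) for col in zip(*P)]             # column masses
    I, J = len(r), len(c)
    # Standardized residuals: S_ij = (p_ij - r_i c_j) / sqrt(r_i c_j)
    S = [[(P[i][j] - r[i] * c[j]) / math.sqrt(r[i] * c[j]) for j in range(J)]
         for i in range(I)]
    # The eigenvalues of S^T S are the principal inertias of the axes.
    A = [[sum(S[i][a] * S[i][b] for i in range(I)) for b in range(J)]
         for a in range(J)]
    inertias = []
    rows = [[] for _ in range(I)]
    cols = [[] for _ in range(J)]
    for _ in range(n_axes):
        lam, v = _dominant_eigenpair(A)
        lam = max(lam, 0.0)
        inertias.append(lam)
        for i in range(I):
            rows[i].append(sum(S[i][j] * v[j] for j in range(J)) / math.sqrt(r[i]))
        for j in range(J):
            cols[j].append(v[j] * math.sqrt(lam) / math.sqrt(c[j]))
        # Deflate so the next pass finds the next axis.
        A = [[A[a][b] - lam * v[a] * v[b] for b in range(J)] for a in range(J)]
    return inertias, rows, cols


# Hypothetical monthly hit counts: site sections (rows) by visitor origin (columns).
sections = ["courses", "staff", "research", "news"]
origins = ["campus", "national", "abroad"]
counts = [
    [120, 40, 10],
    [60, 50, 20],
    [30, 45, 60],
    [80, 30, 15],
]

inertias, row_coords, col_coords = correspondence_analysis(counts)
for name, xy in zip(sections + origins, row_coords + col_coords):
    print(f"{name:10s} axis1={xy[0]: .3f} axis2={xy[1]: .3f}")
```

Points for sections and origins that lie close together on the first two axes tend to be associated more strongly than independence would predict. Comparing several months on one plane, as described above, would additionally require projecting each month's table onto a common set of axes, which this sketch does not attempt.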
