Using incremental Web log mining to create adaptive web servers

Personalizing the content returned from a Web site is an important problem in general, and it affects e-commerce and e-services in particular. Targeting appropriate information or products to the end user can significantly improve the user experience on a Web site. One possible approach to Web personalization is to mine typical user profiles from the vast amount of historical data stored in access logs. We present a system that mines the logs to obtain profiles and uses them to automatically generate a Web page containing URLs the user might be interested in. The generated profiles are based only on the user's prior traversal patterns on the Web site; the user does not have to provide any declarative information or log in. Profiles are dynamic in nature, because a user's traversal pattern changes over time. To reflect these changes in the personalized page generated for the user, the profiles have to be regenerated, taking the existing profile into account. Instead of creating a new profile from scratch, we incrementally add information to and/or remove information from the existing user profile, aiming to save both time and physical memory.
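The idea of updating a profile incrementally, rather than rebuilding it from the full log, can be illustrated with a minimal sketch. It is not the system's actual mining method, which the abstract does not specify. It assumes, purely for illustration, that a profile is a table of per-URL visit counts; the names `update_profile` and `recommend` and the sample URLs are hypothetical.

```python
from collections import Counter


def update_profile(profile, added_sessions=(), removed_sessions=()):
    """Adjust a URL-count profile in place (assumed representation).

    New sessions are added and outdated sessions are subtracted, so the
    full access log never has to be re-read.
    """
    for session in added_sessions:
        profile.update(session)      # add one count per visited URL
    for session in removed_sessions:
        profile.subtract(session)    # remove counts of expired visits
    # Drop URLs whose count fell to zero or below.
    for url in [u for u, count in profile.items() if count <= 0]:
        del profile[url]
    return profile


def recommend(profile, k=3):
    """Return up to k URLs with the highest counts for the generated page."""
    return [url for url, _ in profile.most_common(k)]


# Hypothetical sessions: each one is the list of URLs a user traversed.
profile = Counter()
update_profile(profile, added_sessions=[["/a", "/b"], ["/a", "/c"]])
update_profile(
    profile,
    added_sessions=[["/c", "/c", "/d"]],
    removed_sessions=[["/a", "/b"]],
)
top_url = recommend(profile, k=1)
```

In this sketch each update touches only the URLs in the sessions being added or removed, which is where the savings in time and memory would come from under the stated assumption.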
