Enabling scalable online personalization on the Web

Online personalization is of great interest to e-companies. Virtually all personalization technologies are based on the idea of storing as much historical customer session data as possible, and then querying the data store as customers navigate through a web site. The holy grail of online personalization is an environment where fine-grained, detailed historical session data can be queried based on current online navigation patterns for use in formulating real-time responses. Unfortunately, as more consumers become e-shoppers, the user load and the amount of historical data continue to increase, causing scalability-related problems for almost all current personalization technologies. This paper describes the development of a real-time interaction management engine that integrates historical data with the online visitation patterns of e-commerce site visitors. It presents the scientific underpinnings of the system, as well as its architecture and a performance evaluation. The experimental evaluation shows that our caching and storage techniques deliver performance that is orders of magnitude better than that obtained with off-the-shelf database components.
