Data mining and personalization technologies

Data mining has become increasingly popular and is widely used in various application areas. In this paper, we examine new developments in data mining and its application to personalization in E-commerce. Personalization is what merchants and publishers want to do to tailor the Web site or advertisement and product promotion to a customer based on his past behavior and inference from other like-minded people. E-commerce offers the opportunity to deploy this type of one-to-one marketing instead of the traditional mass marketing. The technology challenges to support personalization are discussed. These include the need to perform clustering and searching in very high dimensional data space with huge amount of data. We examine some of the new data mining technologies developed that can support personalization.

[1]  Yoav Shoham,et al.  Fab: content-based, collaborative recommendation , 1997, CACM.

[2]  Philip S. Yu,et al.  A new framework for itemset generation , 1998, PODS '98.

[3]  Jorma Rissanen,et al.  SLIQ: A Fast Scalable Classifier for Data Mining , 1996, EDBT.

[4]  Yoav Shoham,et al.  Content-Based, Collaborative Recommendation. , 1997 .

[5]  Philip S. Yu,et al.  Using a Hash-Based Method with Transaction Trimming for Mining Association Rules , 1997, IEEE Trans. Knowl. Data Eng..

[6]  Rajeev Motwani,et al.  Beyond market baskets: generalizing association rules to correlations , 1997, SIGMOD '97.

[7]  Hans-Peter Kriegel,et al.  A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise , 1996, KDD.

[8]  Hans-Peter Kriegel,et al.  Optimal multi-step k-nearest neighbor search , 1998, SIGMOD '98.

[9]  Ramakrishnan Srikant,et al.  Mining quantitative association rules in large relational tables , 1996, SIGMOD '96.

[10]  Hans-Peter Kriegel,et al.  A Database Interface for Clustering in Large Spatial Databases , 1995, KDD.

[11]  Philip S. Yu,et al.  Fast algorithms for projected clustering , 1999, SIGMOD '99.

[12]  Philip S. Yu,et al.  A Framework for the Optimizing of WWW Advertising , 1998, Trends in Distributed Systems for Electronic Commerce.

[13]  Philip S. Yu,et al.  Online generation of association rules , 1998, Proceedings 14th International Conference on Data Engineering.

[14]  John Riedl,et al.  GroupLens: an open architecture for collaborative filtering of netnews , 1994, CSCW '94.

[15]  Ramakrishnan Srikant,et al.  Mining generalized association rules , 1995, Future Gener. Comput. Syst..

[16]  Jiawei Han,et al.  Efficient and Effective Clustering Methods for Spatial Data Mining , 1994, VLDB.

[17]  Marti A. Hearst,et al.  Reexamining the cluster hypothesis: scatter/gather on retrieval results , 1996, SIGIR '96.

[18]  Ramakrishnan Srikant,et al.  Fast Algorithms for Mining Association Rules in Large Databases , 1994, VLDB.

[19]  Paul Resnick,et al.  Recommender systems , 1997, CACM.

[20]  Pattie Maes,et al.  Social information filtering: algorithms for automating “word of mouth” , 1995, CHI '95.

[21]  Philip S. Yu,et al.  Data Mining: An Overview from a Database Perspective , 1996, IEEE Trans. Knowl. Data Eng..

[22]  Philip S. Yu,et al.  A new method for similarity indexing of market basket data , 1999, SIGMOD '99.

[23]  Bruce Krulwich,et al.  Learning user information interests through extraction of semantically significant phrases , 1996 .

[24]  Roberto J. Bayardo,et al.  Efficiently mining long patterns from databases , 1998, SIGMOD '98.

[25]  Ming-Syan Chen,et al.  Using multi-attribute predicates for mining classification rules , 1998, Proceedings. The Twenty-Second Annual International Computer Software and Applications Conference (Compsac '98) (Cat. No.98CB 36241).

[26]  Ken Lang,et al.  NewsWeeder: Learning to Filter Netnews , 1995, ICML.

[27]  Hans-Peter Kriegel,et al.  Knowledge Discovery in Large Spatial Databases: Focusing Techniques for Efficient Class Identification , 1995, SSD.

[28]  Philip S. Yu,et al.  Online algorithms for finding profile association rules , 1998, CIKM '98.

[29]  Ron Kohavi,et al.  Feature Subset Selection Using the Wrapper Method: Overfitting and Dynamic Search Space Topology , 1995, KDD.

[30]  Isidore Rigoutsos,et al.  An algorithm for point clustering and grid generation , 1991, IEEE Trans. Syst. Man Cybern..

[31]  David R. Karger,et al.  Constant interaction-time scatter/gather browsing of very large document collections , 1993, SIGIR.

[32]  Christian Böhm,et al.  A cost model for nearest neighbor search in high-dimensional data space , 1997, PODS.

[33]  Douglas B. Terry,et al.  Using collaborative filtering to weave an information tapestry , 1992, CACM.

[34]  Gerard Salton,et al.  Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer , 1989 .

[35]  Rakesh Agrawal,et al.  SPRINT: A Scalable Parallel Classifier for Data Mining , 1996, VLDB.

[36]  Tomasz Imielinski,et al.  Mining association rules between sets of items in large databases , 1993, SIGMOD Conference.

[37]  Anil K. Jain,et al.  Algorithms for Clustering Data , 1988 .

[38]  Tian Zhang,et al.  BIRCH: an efficient data clustering method for very large databases , 1996, SIGMOD '96.