A New Algorithm for Performing Ratings-Based Collaborative Filtering

Collaborative filtering is the most successful recommender system technology to date. It has been shown to produce high-quality recommendations, but its performance degrades as the number of customers and products grows. In this paper, based on the characteristics of rating data, we present a new similarity function, Hsim(), and a signature table-based algorithm for performing collaborative filtering. The method partitions the original data into sets of signatures and then builds a signature table to avoid a sequential scan. Our preliminary experiments on a number of real data sets show that the new method can improve both the scalability and the quality of collaborative filtering. Because the new method applies data clustering algorithms to the rating data, predictions can be computed independently within one or a few partitions. Ideally, partitioning will also improve the quality of collaborative filtering predictions. In future research, we will continue to study how to further improve prediction quality.
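
The signature-table idea described above can be illustrated with a minimal sketch. It is not the algorithm of this paper: the abstract does not define Hsim(), so plain cosine similarity is swapped in as a stand-in, and the toy ratings, the item partition `SIGNATURES`, the activation rule (`threshold`, `min_rating`), and all function names are hypothetical choices made only for illustration. The sketch shows how grouping users by a bit-vector "supercoordinate" lets a neighbor search inspect a few table entries instead of scanning every user.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy data: user -> {item: rating on a 1-5 scale}.
RATINGS = {
    "u1": {"a": 5, "b": 4, "c": 1},
    "u2": {"a": 4, "b": 5, "d": 2},
    "u3": {"e": 5, "f": 4},
    "u4": {"e": 4, "f": 5, "a": 1},
}

# Hypothetical partition of the items into signatures (in practice this
# would come from a clustering step over the rating data).
SIGNATURES = [{"a", "b"}, {"c", "d"}, {"e", "f"}]


def supercoordinate(user_ratings, signatures, threshold=1, min_rating=3):
    """Return a bit tuple; bit k is 1 when the user has at least
    `threshold` items rated >= `min_rating` inside signature k.
    (Assumed activation rule, not taken from the paper.)"""
    return tuple(
        int(sum(1 for item in sig if user_ratings.get(item, 0) >= min_rating)
            >= threshold)
        for sig in signatures
    )


def build_signature_table(ratings, signatures):
    """Map each supercoordinate to the list of users that share it."""
    table = defaultdict(list)
    for user, user_ratings in ratings.items():
        table[supercoordinate(user_ratings, signatures)].append(user)
    return table


def cosine(r1, r2):
    """Cosine similarity over sparse rating dicts (stand-in for Hsim())."""
    common = set(r1) & set(r2)
    if not common:
        return 0.0
    dot = sum(r1[i] * r2[i] for i in common)
    norm1 = sqrt(sum(v * v for v in r1.values()))
    norm2 = sqrt(sum(v * v for v in r2.values()))
    return dot / (norm1 * norm2)


def candidate_neighbors(target, ratings, signatures, table):
    """Collect users from table entries that share at least one
    activated signature with the target, skipping all other entries."""
    target_coord = supercoordinate(ratings[target], signatures)
    candidates = []
    for coord, users in table.items():
        if any(a and b for a, b in zip(coord, target_coord)):
            candidates.extend(u for u in users if u != target)
    return candidates


def nearest_neighbors(target, ratings, signatures, table, k=2):
    """Rank only the candidates by similarity and keep the top k."""
    scored = [
        (cosine(ratings[target], ratings[u]), u)
        for u in candidate_neighbors(target, ratings, signatures, table)
    ]
    scored.sort(reverse=True)
    return [u for _, u in scored[:k]]


table = build_signature_table(RATINGS, SIGNATURES)
neighbors_of_u1 = nearest_neighbors("u1", RATINGS, SIGNATURES, table)
```

In this toy setting, the search for neighbors of `u1` only computes a similarity against `u2`, because `u3` and `u4` fall in a table entry that shares no activated signature with `u1`. A real system would also need a policy for entries that are close but not overlapping, which this sketch leaves out.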
