Yoda: An Accurate and Scalable Web-Based Recommendation System

Recommendation systems are used to personalize and customize the Web environment. We have developed a recommendation system, termed Yoda, that is designed to support large-scale Web-based applications requiring highly accurate recommendations in real time. With Yoda, we introduce a hybrid approach that combines collaborative filtering (CF) and content-based querying to achieve higher accuracy. Yoda is structured as a tunable model that is trained off-line and employed for real-time recommendation on-line. The on-line process benefits from an optimized, low-complexity aggregation function that allows real-time weighted aggregation of the soft classification of active users to predefined recommendation sets. By leveraging the localized distribution of the recommendable items, the same aggregation function is further optimized for the off-line process to reduce the time complexity of constructing the model's predefined recommendation sets. To make the off-line process even more scalable, we also propose a filtering mechanism, FLSH, that extends the Locality Sensitive Hashing technique by incorporating a novel distance measure that satisfies specific requirements of our application. Our end-to-end experiments show that while Yoda's complexity is low and remains constant as the number of users and/or items grows, its accuracy surpasses that of the basic nearest-neighbor method by a wide margin (in most cases by more than 100%).
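To make the idea of weighted aggregation over predefined recommendation sets concrete, the following is a minimal sketch, not Yoda's actual aggregation function (which the abstract describes only as optimized and low-complexity). It assumes that an active user's soft classification is a mapping from cluster name to membership degree, and that each cluster's predefined recommendation set maps items to confidence values; the naive weighted sum and all names (`aggregate_recommendations`, `memberships`, `recommendation_sets`, the cluster and item labels) are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def aggregate_recommendations(memberships, recommendation_sets, top_n=3):
    """Naive weighted aggregation (illustrative assumption, not Yoda's optimized function).

    memberships: dict cluster -> membership degree of the active user (soft classification)
    recommendation_sets: dict cluster -> dict item -> confidence (built off-line)
    Returns the top_n (item, score) pairs, highest score first.
    """
    scores = defaultdict(float)
    for cluster, weight in memberships.items():
        # Each cluster contributes its items, scaled by how strongly
        # the active user belongs to that cluster.
        for item, confidence in recommendation_sets.get(cluster, {}).items():
            scores[item] += weight * confidence
    # Sort by descending score; break ties by item name for determinism.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:top_n]


# Hypothetical data: one active user softly classified into two clusters.
memberships = {"jazz_fans": 0.75, "rock_fans": 0.25}
recommendation_sets = {
    "jazz_fans": {"item_a": 1.0, "item_b": 0.5},
    "rock_fans": {"item_b": 1.0, "item_c": 0.5},
}
top_items = aggregate_recommendations(memberships, recommendation_sets, top_n=2)
print(top_items)
```

In this sketch the per-request cost depends on the number of clusters and the size of their recommendation sets rather than on the total number of users, which is one plausible reading of why a model of this shape can keep on-line complexity constant as the user base grows.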
