Collaborative Ensemble Learning: Combining Collaborative and Content-Based Information Filtering via Hierarchical Bayes

Collaborative filtering (CF) and content-based filtering (CBF) have been widely used in information filtering applications, each approach having its own strengths and weaknesses. This paper proposes a novel probabilistic framework that unifies CF and CBF, named collaborative ensemble learning. Starting from content-based probabilistic models of each user's preferences (the CBF idea), it combines the preferences of a society of users to predict an active user's preferences (the CF idea). While retaining an intuitive explanation, the combination scheme can be interpreted as a hierarchical Bayesian approach in which a common prior distribution is learned from related experiments. It does not require a global training stage and can therefore incorporate new data incrementally. We report results on two data sets: the Reuters-21578 text data set and a database of user opinions on art images. On both data sets, collaborative ensemble learning achieved excellent performance in terms of recommendation accuracy. In addition to recommendation engines, collaborative ensemble learning is applicable to problems typically solved via classical hierarchical Bayes, such as multisensor fusion and multitask learning.
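The combination scheme described in the abstract can be illustrated with a minimal sketch. It is one possible reading of the idea, not the paper's actual implementation: each user has a content-based probabilistic model, and the active user's prediction is an average over all users' models, weighted by how well each model explains the few ratings the active user has given so far. The class `ProfileModel`, the logistic form of the per-user model, and the function names are all assumptions introduced here for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class ProfileModel:
    """Hypothetical content-based model of one user (assumed logistic form)."""

    def __init__(self, weights, bias=0.0):
        self.weights = weights
        self.bias = bias

    def prob_like(self, x):
        # Probability that this user likes an item with feature vector x.
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        return sigmoid(z)


def ensemble_weights(models, rated_items):
    """Weight each user's model by the likelihood it assigns to the
    active user's ratings; rated_items is a list of (features, label)
    pairs with label 1 (like) or 0 (dislike)."""
    log_liks = []
    for m in models:
        ll = 0.0
        for x, y in rated_items:
            # Clip to avoid log(0) for very confident models.
            p = min(max(m.prob_like(x), 1e-12), 1.0 - 1e-12)
            ll += math.log(p if y == 1 else 1.0 - p)
        log_liks.append(ll)
    # Normalize in log space for numerical stability.
    top = max(log_liks)
    unnorm = [math.exp(ll - top) for ll in log_liks]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def predict_like(models, rated_items, x_new):
    """Ensemble prediction for the active user on an unseen item."""
    w = ensemble_weights(models, rated_items)
    return sum(wi * m.prob_like(x_new) for wi, m in zip(w, models))


# Two toy users with opposite tastes over two item features.
models = [ProfileModel([4.0, -4.0]), ProfileModel([-4.0, 4.0])]
# The active user liked a feature-0 item and disliked a feature-1 item.
rated = [([1.0, 0.0], 1), ([0.0, 1.0], 0)]
print(ensemble_weights(models, rated))
print(predict_like(models, rated, [1.0, 0.0]))
```

In this sketch, adding a new user only appends one more model to `models`, which mirrors the abstract's point that no global training stage is needed. With no ratings from the active user, all models receive equal weight, which plays the role of the common prior.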
