Combinational collaborative filtering for personalized community recommendation

Rapid growth in the amount of data available on social networking sites has made information retrieval increasingly challenging for users. In this paper, we propose a collaborative filtering method, Combinational Collaborative Filtering (CCF), to perform personalized community recommendations by considering multiple types of co-occurrences in social data at the same time. This filtering method fuses semantic and user information, then applies a hybrid training strategy that combines Gibbs sampling and Expectation-Maximization algorithm. To handle the large-scale dataset, parallel computing is used to speed up the model training. Through an empirical study on the Orkut dataset, we show CCF to be both effective and scalable.

[1]  Miss A.O. Penney (b) , 1974, The New Yale Book of Quotations.

[2]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[3]  Donald Geman,et al.  Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images , 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Richard A. Harshman,et al.  Indexing by Latent Semantic Analysis , 1990, J. Am. Soc. Inf. Sci..

[5]  Douglas B. Terry,et al.  Using collaborative filtering to weave an information tapestry , 1992, CACM.

[6]  John Riedl,et al.  GroupLens: an open architecture for collaborative filtering of netnews , 1994, CSCW '94.

[7]  Pattie Maes,et al.  Social information filtering: algorithms for automating “word of mouth” , 1995, CHI '95.

[8]  Bart Selman,et al.  Referral Web: combining social networks and collaborative filtering , 1997, CACM.

[9]  Bradley N. Miller,et al.  Applying Collaborative Filtering to Usenet News , 1997 .

[10]  David Heckerman,et al.  Empirical Analysis of Predictive Algorithms for Collaborative Filtering , 1998, UAI.

[11]  Thomas Hofmann,et al.  Probabilistic Latent Semantic Analysis , 1999, UAI.

[12]  Thomas Hofmann,et al.  Latent Class Models for Collaborative Filtering , 1999, IJCAI.

[13]  David Cohn,et al.  Learning to Probabilistically Identify Authoritative Documents , 2000, ICML.

[14]  David A. Cohn,et al.  The Missing Link - A Probabilistic Model of Document Content and Hypertext Connectivity , 2000, NIPS.

[15]  John Riedl,et al.  Item-based collaborative filtering recommendation algorithms , 2001, WWW '01.

[16]  Tong Zhang,et al.  Recommender systems using linear classifiers , 2002 .

[17]  Joydeep Ghosh,et al.  Cluster Ensembles --- A Knowledge Reuse Framework for Combining Multiple Partitions , 2002, J. Mach. Learn. Res..

[18]  Robert B. Ross,et al.  Using MPI-2: Advanced Features of the Message Passing Interface , 2003, CLUSTER.

[19]  Matthew Brand,et al.  Fast Online SVD Revisions for Lightweight Recommender Systems , 2003, SDM.

[20]  Greg Linden,et al.  Amazon . com Recommendations Item-to-Item Collaborative Filtering , 2001 .

[21]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[22]  Thomas Hofmann,et al.  Latent semantic models for collaborative filtering , 2004, TOIS.

[23]  Paul R. Cohen,et al.  Bayesian Clustering by Dynamics Contents 1 Introduction 1 2 Clustering Markov Chains 2 , 2022 .

[24]  Michael I. Jordan,et al.  Variational methods for the Dirichlet process , 2004, ICML.

[25]  George Karypis,et al.  Item-based top-N recommendation algorithms , 2004, TOIS.

[26]  Joydeep Ghosh,et al.  Under Consideration for Publication in Knowledge and Information Systems Generative Model-based Document Clustering: a Comparative Study , 2003 .

[27]  Thomas L. Griffiths,et al.  Probabilistic author-topic models for information discovery , 2004, KDD.

[28]  Thomas Hofmann,et al.  Unsupervised Learning by Probabilistic Latent Semantic Analysis , 2004, Machine Learning.

[29]  Mark Steyvers,et al.  Finding scientific topics , 2004, Proceedings of the National Academy of Sciences of the United States of America.

[30]  Mehran Sahami,et al.  Evaluating similarity measures: a large-scale study in the orkut social network , 2005, KDD '05.

[31]  Lada A. Adamic,et al.  How to search a social network , 2005, Soc. Networks.

[32]  Andrew McCallum,et al.  The Author-Recipient-Topic Model for Topic and Role Discovery in Social Networks: Experiments with Enron and Academic Email , 2005 .

[33]  Geoffrey E. Hinton,et al.  Restricted Boltzmann machines for collaborative filtering , 2007, ICML '07.

[34]  Max Welling,et al.  Distributed Inference for Latent Dirichlet Allocation , 2007, NIPS.

[35]  Edward Y. Chang,et al.  Parallel Spectral Clustering , 2008, ECML/PKDD.

[36]  Edward Y. Chang,et al.  Collaborative filtering for orkut communities: discovery of user latent behavior , 2009, WWW '09.