Collaborative filtering with privacy via factor analysis

Collaborative filtering (CF) is valuable in e-commerce and for direct recommendations of music, movies, news, etc. But today's systems have several disadvantages, including privacy risks. As we move toward ubiquitous computing, there is great potential for individuals to share all kinds of information about places and things to do, see, and buy, but the privacy risks are severe. In this paper we describe a new method for collaborative filtering that protects the privacy of individual data. The method is based on a probabilistic factor analysis model. Privacy protection is provided by a peer-to-peer protocol that is described elsewhere but outlined in this paper. The factor analysis approach handles missing data without requiring default values for them. We report several experiments suggesting that this is the most accurate CF method to date. The new algorithm also has advantages in speed and storage over previous algorithms. Finally, we suggest applications of the approach to other kinds of statistical analyses of survey or questionnaire data.
