Privacy-preserving SOM-based recommendations on horizontally distributed data

Collaborative filtering algorithms need sufficient data to produce accurate predictions. Because of the nature of online shopping and the growing number of online vendors, different customers' preferences about the same products can be spread across various companies, including competing vendors. Companies that hold data on too few users might therefore decide to combine their data so that they can offer accurate predictions with acceptable online performance. However, they do not want to divulge their data, which are considered confidential and valuable. Furthermore, disclosing users' preferences is not legal; if privacy is protected, however, the companies can still collaborate to produce accurate predictions. We propose a privacy-preserving scheme that provides recommendations on data horizontally partitioned among multiple parties. To improve online performance, the parties cluster their distributed data off-line without greatly jeopardizing their privacy. They then estimate predictions using a k-nearest neighbor approach while preserving their privacy. We demonstrate that the proposed method preserves data owners' privacy and produces predictions efficiently. We analyze the accuracy of our scheme through several experiments on real data sets. Our empirical results show that it is still possible to estimate accurate predictions efficiently from horizontally distributed data while maintaining data owners' confidentiality.
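The two-phase idea (off-line clustering, then k-nearest neighbor prediction restricted to one cluster) can be illustrated with a minimal Python sketch. It is not the scheme proposed here: the privacy-preserving steps are omitted entirely, the fixed `centroids` merely stand in for trained SOM node weights, and every name, the toy ratings, the distance measure, and the inverse-distance weighting are assumptions made for illustration.

```python
from math import sqrt

def distance(a, b):
    """Root-mean-square difference over items rated in both vectors (None = unrated)."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return float("inf")
    return sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

def nearest_cluster(vector, centroids):
    """Index of the centroid closest to the given rating vector."""
    return min(range(len(centroids)), key=lambda c: distance(vector, centroids[c]))

def cluster_offline(party_rows, centroids):
    """Off-line phase: a party groups its own users by nearest centroid."""
    clusters = {}
    for row in party_rows:
        clusters.setdefault(nearest_cluster(row, centroids), []).append(row)
    return clusters

def predict(active, target_item, parties_clusters, centroids, k=2):
    """On-line phase: k-NN over the active user's cluster only, across all parties."""
    cid = nearest_cluster(active, centroids)
    candidates = []
    for clusters in parties_clusters:
        for row in clusters.get(cid, []):
            if row[target_item] is not None:
                candidates.append((distance(active, row), row[target_item]))
    candidates.sort(key=lambda pair: pair[0])
    neighbors = candidates[:k]
    if not neighbors:
        return None
    # Assumed weighting: closer neighbors count more.
    weights = [1.0 / (1.0 + d) for d, _ in neighbors]
    return sum(w * r for w, (_, r) in zip(weights, neighbors)) / sum(weights)

# Horizontal partitioning: each party holds different users over the same four items.
party_a = [[5, 4, 1, 1], [4, 5, 2, 1]]
party_b = [[1, 1, 5, 4], [5, 5, 1, 2], [2, 1, 4, 5]]

# Hypothetical centroids standing in for trained SOM node weights.
centroids = [[5, 5, 1, 1], [1, 1, 5, 5]]

parties_clusters = [cluster_offline(p, centroids) for p in (party_a, party_b)]

active_user = [5, 4, None, 1]  # item 2 is unrated and is the prediction target
prediction = predict(active_user, 2, parties_clusters, centroids, k=2)
print(prediction)
```

Restricting the neighbor search to a single cluster is what reduces on-line cost in this sketch: only users mapped to the active user's cluster are compared, not every user held by every party.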
