A fuzzy variant of k-member clustering for collaborative filtering with data anonymization

Privacy-preserving data mining is a promising approach for encouraging users to take advantage of IT services without fear of information leaks. k-member clustering is a basic technique for achieving k-anonymization, in which data samples are summarized so that every sample is indistinguishable from at least k - 1 other samples. This paper proposes a fuzzy variant of k-member clustering that aims to improve the quality of data summarization under k-anonymity. Each k-member cluster is extracted by taking into account the fuzzy membership degrees of the samples, which are estimated from their distances to the clusters. The proposed anonymization method is also applied to collaborative filtering, in which the main task is to estimate the applicability of items that a user has not yet evaluated. Experimental results demonstrate the characteristic features of the proposed anonymization method.
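To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a greedy crisp k-member clustering (grow each cluster from a seed until it holds k samples), summarization by cluster centroids, and a fuzzy c-means style membership formula with fuzzifier `m`. All function names, the seed-selection rule, and the handling of leftover samples are hypothetical choices made for illustration.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(rows):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def k_member_clusters(data, k):
    """Greedy crisp k-member clustering (illustrative assumption).

    Returns a list of clusters, each a list of indices into `data`,
    where every cluster holds at least k samples.
    """
    if k < 1 or len(data) < k:
        raise ValueError("need k >= 1 and at least k samples")
    remaining = set(range(len(data)))
    clusters = []
    seed = 0  # assumed rule: start from the first sample
    while len(remaining) >= k:
        remaining.discard(seed)
        members = [seed]
        while len(members) < k:
            c = centroid([data[i] for i in members])
            # add the remaining sample closest to the current centroid
            nearest = min(remaining, key=lambda i: (dist(data[i], c), i))
            remaining.discard(nearest)
            members.append(nearest)
        clusters.append(members)
        if remaining:
            # assumed rule: next seed is farthest from the cluster just built
            c = centroid([data[i] for i in members])
            seed = max(remaining, key=lambda i: (dist(data[i], c), -i))
    # fewer than k samples are left: attach each to its nearest cluster
    for i in sorted(remaining):
        best = min(
            range(len(clusters)),
            key=lambda j: dist(data[i], centroid([data[t] for t in clusters[j]])),
        )
        clusters[best].append(i)
    return clusters

def anonymize(data, k):
    """Replace every sample by the centroid of its k-member cluster."""
    out = [None] * len(data)
    for members in k_member_clusters(data, k):
        c = centroid([data[i] for i in members])
        for i in members:
            out[i] = c
    return out

def fuzzy_memberships(sample, centers, m=2.0):
    """Fuzzy c-means style memberships of one sample to each center.

    Closer centers receive larger degrees; the degrees sum to 1.
    """
    if m <= 1:
        raise ValueError("fuzzifier m must be greater than 1")
    d = [dist(sample, c) for c in centers]
    zeros = [x == 0 for x in d]
    if any(zeros):
        # sample coincides with a center: share full membership among those
        n = sum(zeros)
        return [1.0 / n if z else 0.0 for z in zeros]
    inv = [x ** (-2.0 / (m - 1.0)) for x in d]
    total = sum(inv)
    return [v / total for v in inv]

# Toy data: two tight groups plus one sample in between.
ratings = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10], [5, 5]]
summarized = anonymize(ratings, k=3)
```

After `anonymize`, every released row is shared by at least k samples, which is the k-anonymity property described above. The membership function only shows how distance-based fuzzy degrees could be computed; how the paper uses them during cluster extraction is not specified in the abstract.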
