A Greedy Algorithm for k-Member Co-clustering and its Applicability to Collaborative Filtering

Abstract: Privacy-preserving data mining is an important issue in networked societies, and co-clustering is a basic technique for analyzing the intrinsic structure of co-occurrence information among objects and items. In this paper, a greedy algorithm for k-member clustering, which achieves k-anonymity by coding at least k records into a single observation, is extended to a co-clustering model. The greedy algorithm extracts k-member clusters sequentially, one at a time, where each cluster is composed of homogeneous objects. Numerical experiments examine the applicability of the proposed algorithm to collaborative filtering tasks.
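
The sequential extraction described in the abstract can be illustrated with a minimal sketch of plain greedy k-member clustering on numeric records. This is not the paper's co-clustering model. The function names (`sq_dist`, `centroid`, `greedy_k_member`), the squared Euclidean distance to the cluster centroid as the homogeneity measure, the choice of the first remaining record as the seed, and the handling of leftover records are all assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length numeric records."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(rows):
    """Component-wise mean of a non-empty list of records."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def greedy_k_member(data, k):
    """Extract clusters of at least k records one at a time (illustrative sketch).

    Assumptions: records are numeric vectors, the seed is the first remaining
    record, and homogeneity is measured by distance to the current centroid.
    Returns a list of clusters, each a list of record indices.
    """
    if k < 1 or len(data) < k:
        raise ValueError("need 1 <= k <= number of records")

    remaining = list(range(len(data)))
    clusters = []

    # Extract one k-member cluster per pass while enough records remain.
    while len(remaining) >= k:
        members = [remaining.pop(0)]  # assumed seeding rule
        while len(members) < k:
            c = centroid([data[i] for i in members])
            # Greedily add the record closest to the current centroid.
            best = min(remaining, key=lambda i: sq_dist(data[i], c))
            remaining.remove(best)
            members.append(best)
        clusters.append(members)

    # Fewer than k records are left: attach each to its nearest cluster,
    # so every cluster still has at least k members.
    for i in remaining:
        target = min(
            clusters,
            key=lambda m: sq_dist(data[i], centroid([data[j] for j in m])),
        )
        target.append(i)

    return clusters


if __name__ == "__main__":
    records = [[0, 0], [10, 10], [0, 1], [10, 11], [1, 0], [11, 10], [5, 5]]
    print(greedy_k_member(records, k=3))
```

Because every cluster holds at least k records, replacing each record by a single representative of its cluster (for example, the centroid) makes each record indistinguishable from at least k - 1 others, which is the k-anonymity property the abstract refers to.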
