Improving the Utility of Differentially Private Clustering through Dynamical Processing

This study aims to alleviate the trade-off between utility and privacy in differentially private clustering. Existing works focus on simple clustering methods, which perform poorly on non-convex clusters. Using Morse theory, we hierarchically connect Gaussian sub-clusters to fit complex cluster distributions. Because the differentially private sub-clusters are obtained through existing methods, the proposed method incurs little or no additional privacy loss. We provide theoretical results implying that the proposed method is inductive and can produce any desired number of clusters. Experiments on various datasets show that our framework achieves better clustering performance than existing methods at the same privacy level.
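The idea of hierarchically connecting Gaussian sub-clusters can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the mixture parameters (weights, means, covariances) were already released by some differentially private method, and it replaces the Morse-theoretic search for transition points with a crude proxy, namely the lowest mixture density found along the straight segment between two sub-cluster means. The function names `mixture_density` and `merge_subclusters` and the parameter `n_steps` are hypothetical. Because the sketch only post-processes released parameters and never touches the raw data, it adds no privacy loss under the standard post-processing property of differential privacy.

```python
import numpy as np


def mixture_density(x, weights, means, covs):
    """Evaluate a Gaussian mixture density at points x of shape (n, d)."""
    x = np.atleast_2d(x)
    d = x.shape[1]
    total = np.zeros(x.shape[0])
    for w, m, c in zip(weights, means, covs):
        diff = x - m
        inv = np.linalg.inv(c)
        norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(c))
        # Quadratic form diff^T inv diff for every row of diff.
        quad = np.einsum("ni,ij,nj->n", diff, inv, diff)
        total += w * np.exp(-0.5 * quad) / norm
    return total


def merge_subclusters(weights, means, covs, n_clusters, n_steps=50):
    """Merge sub-clusters until n_clusters groups remain.

    Assumption (not from the paper): the 'barrier' between two sub-clusters
    is approximated by the minimum mixture density on the straight segment
    joining their means. Pairs with the highest barrier density are merged
    first, which yields a hierarchy over the sub-clusters.
    """
    means = np.asarray(means, dtype=float)
    k = len(means)
    ts = np.linspace(0.0, 1.0, n_steps)[:, None]

    pairs = []
    for i in range(k):
        for j in range(i + 1, k):
            path = (1 - ts) * means[i] + ts * means[j]
            barrier = mixture_density(path, weights, means, covs).min()
            pairs.append((barrier, i, j))
    pairs.sort(reverse=True)  # most strongly connected pairs first

    # Union-find over sub-cluster indices.
    parent = list(range(k))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    remaining = k
    for _, i, j in pairs:
        if remaining <= n_clusters:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri
            remaining -= 1

    # Relabel roots as consecutive cluster ids in order of first appearance.
    roots = [find(a) for a in range(k)]
    relabel = {r: idx for idx, r in enumerate(dict.fromkeys(roots))}
    return [relabel[r] for r in roots]
```

Stopping the merge loop at a chosen `n_clusters` is what lets the sketch return any number of groups between one and the number of sub-clusters. A new point could then be labeled by assigning it to its most likely sub-cluster and looking up that sub-cluster's merged label, which is one simple way to read the "inductive" property.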
