On the Noise-Information Separation of a Private Principal Component Analysis Scheme

In a survey disclosure model, we consider an additive-noise privacy mechanism and study the trade-off between privacy guarantees and statistical utility. Privacy is approached from two different but complementary viewpoints: information-theoretic and estimation-theoretic. Motivated by the performance of principal component analysis, we measure statistical utility via the spectral gap of a certain covariance matrix. This formulation and its motivation rely on classical results from random matrix theory. We prove some properties of this statistical utility function and discuss a simple numerical method to evaluate it.
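As a rough illustration of the setting, the following Python sketch adds independent Gaussian noise to synthetic data and computes the gap between the two largest eigenvalues of the sample covariance matrix. Everything in it is an assumption made for illustration only: the rank-one "spiked" data model, the Gaussian noise, the sample sizes, and the helper names `additive_noise_release` and `spectral_gap` do not come from the paper, whose exact covariance matrix and utility function are not specified in this abstract.

```python
import numpy as np


def additive_noise_release(samples, noise_std, rng):
    """Release the data with i.i.d. Gaussian noise added to every entry.

    Hypothetical helper: Gaussian noise is an assumed choice of mechanism.
    """
    return samples + noise_std * rng.standard_normal(samples.shape)


def spectral_gap(samples):
    """Gap between the two largest eigenvalues of the sample covariance.

    Hypothetical stand-in for a spectral-gap utility; the paper's exact
    definition may differ.
    """
    cov = np.cov(samples, rowvar=False)   # columns are variables
    eigvals = np.linalg.eigvalsh(cov)     # eigenvalues in ascending order
    return eigvals[-1] - eigvals[-2]


rng = np.random.default_rng(0)
n, p = 2000, 200                          # assumed sample size and dimension
signal_strength = 5.0                     # assumed variance of the single factor

# Synthetic data: one dominant direction plus unit-variance isotropic noise.
direction = np.ones(p) / np.sqrt(p)
factors = rng.standard_normal((n, 1))
data = np.sqrt(signal_strength) * factors * direction + rng.standard_normal((n, p))

# Evaluate the empirical gap at a few assumed privacy-noise levels.
gaps = {}
for noise_std in (0.0, 1.0, 4.0):
    released = additive_noise_release(data, noise_std, rng)
    gaps[noise_std] = spectral_gap(released)

for noise_std, gap in gaps.items():
    print(f"noise_std={noise_std:.1f}  spectral gap={gap:.3f}")
```

Under these assumptions, one would expect the empirical gap to shrink as the noise level grows, which is the kind of privacy-utility tension the abstract describes; the numbers themselves carry no significance beyond this toy setup.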
