A Differential Privacy Mechanism Design Under Matrix-Valued Query

Traditionally, differential privacy mechanism design has been tailored to scalar-valued query functions. Although many mechanisms, such as the Laplace and Gaussian mechanisms, can be extended to a matrix-valued query function by adding i.i.d. noise to each element of the matrix, this approach is often sub-optimal because it forfeits the opportunity to exploit the structural characteristics typically associated with matrix analysis. In this work, we consider the design of a differential privacy mechanism specifically for matrix-valued query functions. The proposed solution is to use matrix-variate noise, as opposed to the traditional scalar-valued noise. In particular, we propose a novel differential privacy mechanism called the Matrix-Variate Gaussian (MVG) mechanism, which adds matrix-valued noise drawn from a matrix-variate Gaussian distribution. We prove that the MVG mechanism preserves $(\epsilon,\delta)$-differential privacy, and show that it allows the structural characteristics of the matrix-valued query function to be exploited naturally. Furthermore, owing to the multi-dimensional nature of the MVG mechanism and the matrix-valued query, we introduce the concept of directional noise, which can be used to mitigate the impact of the noise on the utility of the query. Finally, we demonstrate the performance of the MVG mechanism and the advantages of directional noise using three matrix-valued queries on three privacy-sensitive datasets. We find that the MVG mechanism notably outperforms four previous state-of-the-art approaches and provides utility comparable to the non-private baseline. Our work thus presents a promising prospect for both future research on and implementation of differential privacy for matrix-valued query functions.
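As a rough illustration of the idea of adding matrix-variate Gaussian noise to a matrix-valued query, the sketch below samples zero-mean noise with a row covariance and a column covariance and adds it to a query output. The function names (`sample_matrix_gaussian`, `mvg_perturb`), the example query, and the covariance values are all hypothetical choices made for illustration. In particular, the covariances here are not calibrated to any sensitivity or to $(\epsilon,\delta)$, so this sketch carries no privacy guarantee; it only shows how unequal variances across directions ("directional noise") can be expressed.

```python
import numpy as np


def sample_matrix_gaussian(row_cov, col_cov, rng):
    """Draw Z ~ MN(0, row_cov, col_cov) using Z = A X B^T.

    A and B are Cholesky factors (row_cov = A A^T, col_cov = B B^T)
    and X has i.i.d. standard normal entries.
    """
    a = np.linalg.cholesky(row_cov)
    b = np.linalg.cholesky(col_cov)
    x = rng.standard_normal((row_cov.shape[0], col_cov.shape[0]))
    return a @ x @ b.T


def mvg_perturb(query_output, row_cov, col_cov, rng):
    """Add matrix-variate Gaussian noise to a matrix-valued query output.

    NOTE: illustrative only. The covariances must be calibrated to the
    query's sensitivity and the privacy parameters to obtain a guarantee;
    that calibration is NOT done here.
    """
    return query_output + sample_matrix_gaussian(row_cov, col_cov, rng)


rng = np.random.default_rng(0)

# Hypothetical matrix-valued query: a 4x4 second-moment matrix of toy data.
data = rng.standard_normal((4, 50))
query_output = data @ data.T / data.shape[1]

# Placeholder "directional" row covariance: more noise variance on the
# first row direction, less on the last two. Values are arbitrary.
row_cov = np.diag([4.0, 1.0, 0.25, 0.25])
col_cov = np.eye(4)

noisy_output = mvg_perturb(query_output, row_cov, col_cov, rng)
print(noisy_output.shape)
```

With an identity row and column covariance this reduces to adding i.i.d. Gaussian noise to each entry, which is the baseline the abstract contrasts against.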
