Formal Privacy for Functional Data with Gaussian Perturbations

Motivated by the rapid rise of statistical tools in functional data analysis, we consider the Gaussian mechanism for achieving differential privacy with parameter estimates taking values in a potentially infinite-dimensional separable Banach space. Using classic results from probability theory, we show how densities over function spaces can be used to achieve the desired differential privacy bounds. This extends prior results of Hall et al. (2013) to a much broader class of statistical estimates and summaries, including "path level" summaries, nonlinear functionals, and full function releases. By focusing on Banach spaces, we provide a deeper picture of the challenges for privacy with complex data, especially the role regularization plays in balancing utility and privacy. Using an application to penalized smoothing, we explicitly highlight this balance in the context of mean function estimation. Simulations and an application to diffusion tensor imaging are briefly presented, with extensive additions included in a supplement.
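For orientation, the classical finite-dimensional Gaussian mechanism that the abstract generalizes can be sketched as follows. This is a minimal illustration under stated assumptions, not the function-space construction of the paper: the function name `gaussian_mechanism`, the grid discretization, and the bounded-curve sensitivity in the usage lines are hypothetical, and the noise scale is the standard textbook calibration, which is valid for 0 < epsilon < 1.

```python
import numpy as np


def gaussian_mechanism(estimate, l2_sensitivity, epsilon, delta, rng=None):
    """Add i.i.d. Gaussian noise to a finite-dimensional estimate.

    Uses the classical calibration
        sigma = l2_sensitivity * sqrt(2 * log(1.25 / delta)) / epsilon,
    which gives (epsilon, delta)-differential privacy for 0 < epsilon < 1.
    Illustration only: this is NOT the infinite-dimensional mechanism
    studied in the paper.
    """
    if not 0.0 < epsilon < 1.0:
        raise ValueError("this calibration assumes 0 < epsilon < 1")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    rng = np.random.default_rng() if rng is None else rng
    estimate = np.asarray(estimate, dtype=float)
    sigma = l2_sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    noise = rng.normal(loc=0.0, scale=sigma, size=estimate.shape)
    return estimate + noise, sigma


# Hypothetical usage: n curves observed on an m-point grid, values in [0, 1].
# Replacing one curve moves the pointwise mean by at most 1/n per grid point,
# so the L2 sensitivity of the discretized mean is at most sqrt(m) / n.
n, m = 200, 50
rng = np.random.default_rng(0)
curves = rng.uniform(0.0, 1.0, size=(n, m))   # synthetic data, for illustration
mean_curve = curves.mean(axis=0)
private_mean, sigma = gaussian_mechanism(
    mean_curve, l2_sensitivity=np.sqrt(m) / n, epsilon=0.5, delta=1e-5, rng=rng
)
```

In this discretized sketch the sensitivity bound grows with the grid size m, which gives a rough sense of why a release defined directly in function space needs a different treatment than coordinate-wise noise.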

[1] Colin Combe, et al. Privacy, Big Data, and the Public Good: Frameworks for Engagement, 2015.

[2] Jacob Feldman, et al. Equivalence and perpendicularity of Gaussian processes, 1958.

[3] Mário S. Alvim, et al. Metric-based local differential privacy for statistical applications, 2018, ArXiv.

[4] Daniel Kifer, et al. Private Convex Empirical Risk Minimization and High-dimensional Regression, 2012, COLT 2012.

[5] Moni Naor, et al. Our Data, Ourselves: Privacy Via Distributed Noise Generation, 2006, EUROCRYPT.

[6] Dale E. Varberg, et al. On equivalence of Gaussian measures, 1961.

[7] P. Hall, et al. Properties of principal component methods for functional and longitudinal data analysis, 2006, math/0608022.

[8] Y. Rozanov. On the Density of One Gaussian Measure with Respect to Another, 1962.

[9] Eun Yong Kang, et al. Identification of individuals by trait prediction using whole-genome sequencing data, 2017, Proceedings of the National Academy of Sciences.

[10] Roger Woodard, et al. Interpolation of Spatial Data: Some Theory for Kriging, 1999, Technometrics.

[11] W. T. Martin, et al. Transformations of Wiener integrals under a general class of linear transformations, 1945.

[12] A. Cuevas. A partial overview of the theory of statistics with functional data, 2014.

[13] Enea G. Bongiorno, et al. Classification methods for Hilbert data based on surrogate density, 2015, Comput. Stat. Data Anal.

[14] Anand D. Sarwate, et al. Differentially Private Empirical Risk Minimization, 2009, J. Mach. Learn. Res.

[15] Cynthia Dwork, et al. Differential Privacy, 2006, ICALP.

[16] Matthew Reimherr, et al. Manifold Data Analysis with Applications to High-Frequency 3D Imaging, 2017, 1710.01619.

[17] Jing Lei, et al. Differentially private model selection with penalized and constrained likelihood, 2016, 1607.04204.

[18] W. T. Martin, et al. The behavior of measure and measurability under change of scale in Wiener space, 1947.

[19] P. Hall, et al. Defining probability density for a distribution of random functions, 2010, 1002.4931.

[20] C. Radhakrishna Rao, et al. Discrimination of Gaussian Processes, 1965.

[21] Yu. A. Rozanov. On Probability Measures in Functional Spaces Corresponding to Stationary Gaussian Processes, 1964.

[22] Ton de Waal, et al. Statistical Disclosure Control in Practice, 1996.

[23] L. Wasserman, et al. A Statistical Framework for Differential Privacy, 2008, 0811.2501.

[24] Piotr Kokoszka, et al. Inference for Functional Data with Applications, 2012.

[25] P. Kokoszka, et al. Introduction to Functional Data Analysis, 2017.

[26] A. V. Skorohod. On the densities of probability measures in functional spaces, 1967.

[27] P. Spreij. Probability and Measure, 1996.

[28] B. Silverman, et al. Functional Data Analysis, 1997.

[29] I. V. Girsanov. On Transforming a Certain Class of Stochastic Processes by Absolutely Continuous Substitution of Measures, 1960.

[30] Larry A. Wasserman, et al. Differential privacy for functions and functional data, 2012, J. Mach. Learn. Res.

[31] Dale E. Varberg. On Gaussian Measures Equivalent to Wiener Measure II, 1966.

[32] Cynthia Dwork, et al. Calibrating Noise to Sensitivity in Private Data Analysis, 2006, TCC.

[33] Stephen E. Fienberg, et al. Statistical Disclosure Limitation for Data Access, 2018, Encyclopedia of Database Systems.

[34] Josep Domingo-Ferrer, et al. Statistical Disclosure Control, 2012.

[35] J. Kulynych, et al. Legal and ethical issues in neuroimaging research: human subjects protection, medical privacy, and the public communication of research results, 2002, Brain and Cognition.

[36] Neil D. Lawrence, et al. Differentially Private Regression with Gaussian Processes, 2018, AISTATS.

[37] H. Müller, et al. Optimal Bayes classifiers for functional data and density ratios, 2016, 1605.03707.

[38] Benjamin I. P. Rubinstein, et al. The Bernstein Mechanism: Function Release under Differential Privacy, 2017, AAAI.

[39] L. Shepp. Radon-Nikodym Derivatives of Gaussian Measures, 1966.

[40] A. Berlinet, et al. Reproducing kernel Hilbert spaces in probability and statistics, 2004.

[41] Ciprian M. Crainiceanu, et al. refund: Regression with Functional Data, 2013.

[42] Yaniv Erlich, et al. Routes for breaching and protecting genetic privacy, 2013.

[43] T. Auton. Applied Functional Data Analysis: Methods and Case Studies, 2004.

[44] R. Laha. Probability Theory, 1979.

[45] K. Hao, et al. Bayesian method to predict individual SNP genotypes from gene expression data, 2012, Nature Genetics.

[46] F. Ferraty, et al. The Oxford Handbook of Functional Data Analysis, 2011, Oxford Handbooks Online.

[47] Dale E. Varberg. On Gaussian measures equivalent to Wiener measure, 1964.

[48] José R. Berrendero, et al. On the Use of Reproducing Kernel Hilbert Spaces in Functional Classification, 2015, Journal of the American Statistical Association.

[49] Stephen E. Fienberg, et al. Data Privacy and Confidentiality, 2011, International Encyclopedia of Statistical Science.

[50] G. Baxter, et al. A strong limit theorem for Gaussian processes, 1956.

[51] L. Shepp, et al. The Singularity of Gaussian Measures in Function Space, 1964, Proceedings of the National Academy of Sciences of the United States of America.

[52] Aaron Roth, et al. The Algorithmic Foundations of Differential Privacy, 2014, Found. Trends Theor. Comput. Sci.

[53] J. Radon. Theorie und Anwendungen der absolut additiven Mengenfunktionen, 1913.