Sharp phase transitions for exact support recovery under local differential privacy.

We address the problem of variable selection in the Gaussian mean model in $\mathbb{R}^d$ under the additional constraint that only privatised data are available for inference. For this purpose, we adopt a recent generalisation of classical minimax theory to the framework of local $\alpha$-differential privacy. We provide lower and upper bounds on the rate of convergence for the expected Hamming loss over classes of at most $s$-sparse vectors whose non-zero coordinates are separated from $0$ by a constant $a>0$. As corollaries, we derive necessary and sufficient conditions (up to logarithmic factors) for exact recovery and for almost full recovery. When we restrict attention to non-interactive mechanisms that act independently on each coordinate, our lower bound shows that, in contrast to the non-private setting, both exact and almost full recovery are impossible, whatever the value of $a$, in the high-dimensional regime where $n\alpha^2/d^2 \lesssim 1$. However, in the regime $n\alpha^2/d^2 \gg \log(n\alpha^2/d^2)\log(d)$ we exhibit a sharp critical value $a^*$ (up to a logarithmic factor) such that exact and almost full recovery are possible for all $a \gg a^*$ and impossible for $a \leq a^*$. We show that these results can be improved when allowing for general non-interactive locally $\alpha$-differentially private mechanisms, which may act globally on all coordinates, in the sense that the phase transitions then occur at lower levels.
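As a purely illustrative reference point, the Python sketch below implements the simplest kind of non-interactive, coordinate-wise locally $\alpha$-differentially private mechanism of the type discussed above: each user clips every coordinate and adds Laplace noise calibrated to a per-coordinate budget of $\alpha/d$, and the analyst selects the coordinates whose privatised sample mean is large. This is not the estimator analysed in the paper; the function names, the clipping level T, and the threshold tau are assumptions made only for this example.

import numpy as np

def privatize_coordinatewise(X, alpha, T, rng):
    """Illustrative non-interactive, coordinate-wise locally DP release.

    Each row of X is one user's observation in R^d. Every coordinate is
    clipped to [-T, T] (sensitivity 2T) and perturbed with independent
    Laplace noise of scale 2*T*d/alpha, so each coordinate release is
    (alpha/d)-DP and the full released vector is alpha-DP by composition.
    """
    n, d = X.shape
    clipped = np.clip(X, -T, T)
    noise = rng.laplace(loc=0.0, scale=2.0 * T * d / alpha, size=(n, d))
    return clipped + noise

def select_support(Z, tau):
    """Return the coordinates whose privatised sample mean exceeds tau."""
    return np.flatnonzero(np.abs(Z.mean(axis=0)) > tau)

# Toy run on an s-sparse mean vector whose non-zero entries equal a > 0.
rng = np.random.default_rng(0)
n, d, s, a, alpha = 200000, 10, 3, 2.0, 1.0
theta = np.zeros(d)
theta[:s] = a
X = theta + rng.standard_normal((n, d))            # Gaussian mean model
Z = privatize_coordinatewise(X, alpha=alpha, T=a + 3.0, rng=rng)
print(select_support(Z, tau=a / 2.0))              # with high probability: [0 1 2]

Heuristically, the privatised per-coordinate sample mean carries noise of order $Td/(\alpha\sqrt{n})$, so this naive selector can only succeed when $n\alpha^2/d^2$ is large; this is consistent with the effective sample size governing the phase transitions above, although the paper's actual thresholds are sharper and differ by logarithmic factors.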
