On the power of adaptivity in statistical adversaries

We study a fundamental question concerning adversarial noise models in statistical problems where the algorithm receives i.i.d. draws from a distribution D. The definitions of these adversaries specify the type of allowable corruptions (noise model) as well as when these corruptions can be made (adaptivity); the latter differentiates between oblivious adversaries that can only corrupt the distribution D and adaptive adversaries that can have their corruptions depend on the specific sample S that is drawn from D. In this work, we investigate whether oblivious adversaries are effectively equivalent to adaptive adversaries, across all noise models studied in the literature. Specifically, can the behavior of an algorithm A in the presence of oblivious adversaries always be well-approximated by that of an algorithm A′ in the presence of adaptive adversaries? Our first result shows that this is indeed the case for the broad class of statistical query algorithms, under all reasonable noise models. We then show that in the specific case of additive noise, this equivalence holds for all algorithms. Finally, we map out an approach towards proving this statement in its fullest generality, for all algorithms and under all reasonable noise models.
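The distinction between the two adversary types can be made concrete with a small simulation (a minimal sketch under assumptions not taken from the paper: mean estimation under additive corruption, with a 10% corruption budget and an outlier value of 10.0 chosen purely for illustration). An oblivious adversary commits to a corrupted distribution before any data is drawn, whereas an adaptive adversary inspects the realized sample and then chooses which points to replace.

```python
import random
import statistics

random.seed(0)

EPS = 0.1   # corruption fraction (illustrative assumption)
N = 1000    # sample size

def clean_draw():
    # D: a clean standard normal distribution
    return random.gauss(0.0, 1.0)

def oblivious_sample(n, eps):
    # Oblivious adversary: fixes the mixture (1 - eps) * D + eps * Q
    # before any sample is drawn; each point is corrupted independently,
    # with no knowledge of the other draws.
    return [10.0 if random.random() < eps else clean_draw() for _ in range(n)]

def adaptive_sample(n, eps):
    # Adaptive adversary: first sees the full i.i.d. sample from D,
    # then replaces the eps-fraction of points whose removal helps it
    # most (here: the smallest points, maximizing upward bias in the mean).
    s = sorted(clean_draw() for _ in range(n))
    k = int(eps * n)
    return s[k:] + [10.0] * k

mean_obl = statistics.mean(oblivious_sample(N, EPS))
mean_ada = statistics.mean(adaptive_sample(N, EPS))
print(f"oblivious-corrupted mean: {mean_obl:.2f}")
print(f"adaptive-corrupted mean:  {mean_ada:.2f}")
```

Because the adaptive adversary conditions its replacements on the drawn sample, it biases the empirical mean more than an oblivious adversary with the same budget; the paper's question is whether this apparent extra power can always be neutralized by a suitably modified algorithm.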
