Maximum Mean Discrepancy is Aware of Adversarial Attacks

The maximum mean discrepancy (MMD) test, as a representative two-sample test, can in principle detect any distributional discrepancy between two datasets. However, it has been shown that MMD is unaware of adversarial attacks: MMD fails to detect the discrepancy between natural data and adversarial data generated by adversarial attacks. Given this phenomenon, we raise a question: are natural and adversarial data really drawn from different distributions, and did previous uses of MMD for this purpose simply miss some key factors? The answer is affirmative. We find that previous uses missed three factors, and we accordingly propose three components: (a) the Gaussian kernel has limited representational power, so we replace it with a novel semantic-aware deep kernel; (b) the test power of MMD was neglected, so we maximize it in order to optimize our deep kernel; (c) adversarial data may be non-independent, so we apply the wild bootstrap to preserve the validity of the test. By taking care of these three factors, we validate that MMD is aware of adversarial attacks, which opens a new road for adversarial attack detection based on two-sample tests.
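To make the baseline concrete, the following sketch computes the standard unbiased estimate of squared MMD with a plain Gaussian (RBF) kernel, i.e. the limited-power setup the abstract argues against; it is an illustrative NumPy implementation, not the paper's semantic-aware deep kernel, and the bandwidth choice is an assumption for the example.

```python
import numpy as np

def gaussian_kernel(X, Y, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Y."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    return np.exp(-sq_dists / (2.0 * bandwidth**2))

def mmd2_unbiased(X, Y, bandwidth=1.0):
    """Unbiased U-statistic estimate of MMD^2 between samples X ~ P and Y ~ Q."""
    m, n = len(X), len(Y)
    Kxx = gaussian_kernel(X, X, bandwidth)
    Kyy = gaussian_kernel(Y, Y, bandwidth)
    Kxy = gaussian_kernel(X, Y, bandwidth)
    # Drop diagonal terms (k(x_i, x_i)) so the within-sample sums are unbiased.
    term_xx = (Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))
    term_yy = (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))
    term_xy = Kxy.sum() / (m * n)
    return term_xx + term_yy - 2.0 * term_xy

rng = np.random.default_rng(0)
# Same distribution: the estimate should hover near zero.
same = mmd2_unbiased(rng.normal(0, 1, (200, 2)), rng.normal(0, 1, (200, 2)))
# Mean-shifted distribution: the estimate should be clearly larger.
diff = mmd2_unbiased(rng.normal(0, 1, (200, 2)), rng.normal(1, 1, (200, 2)))
print(f"same-dist MMD^2: {same:.4f}, shifted-dist MMD^2: {diff:.4f}")
```

This fixed-kernel estimator works for an obvious mean shift, but the abstract's point is that adversarial perturbations are precisely the discrepancies such a kernel fails to expose, motivating a learned deep kernel with maximized test power.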
