Max-Mahalanobis Linear Discriminant Analysis Networks

A deep neural network (DNN) consists of a nonlinear transformation from the input to a feature representation, followed by a common softmax linear classifier. Although much effort has gone into designing the architecture of the nonlinear transformation, the classifier itself has received little attention. In this paper, we show that a properly designed classifier can improve robustness to adversarial attacks and lead to better prediction results. Specifically, we define a Max-Mahalanobis distribution (MMD) and show theoretically that if the input is distributed as an MMD, the linear discriminant analysis (LDA) classifier achieves the best robustness to adversarial examples. We further propose a novel Max-Mahalanobis linear discriminant analysis (MM-LDA) network, which explicitly maps a complicated data distribution in the input space to an MMD in the latent feature space and then applies LDA to make predictions. Our results demonstrate that MM-LDA networks are significantly more robust to adversarial attacks and perform better on class-biased classification.
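
To make the construction concrete, below is a minimal NumPy sketch of the two ingredients the abstract names. It is an illustration under stated assumptions, not the paper's exact algorithm: the names `simplex_means`, `lda_logits`, and the `radius` parameter are hypothetical and introduced here. The sketch assumes the MMD class means are equal-norm vectors with pairwise inner product -1/(L-1), i.e., the vertices of a regular simplex, which maximizes the minimal distance between any two equal-norm means, and that LDA is applied with equal class priors and a shared identity covariance in the latent space.

```python
import numpy as np

def simplex_means(num_classes, dim, radius=10.0):
    """Place class means at the vertices of a regular simplex (assumption).

    Unit vectors with pairwise inner product -1/(num_classes - 1)
    maximize the minimal pairwise distance among equal-norm means,
    which is the intuition behind a Max-Mahalanobis layout.
    """
    assert dim >= num_classes, "this illustrative construction needs dim >= num_classes"
    mu = np.zeros((num_classes, dim))
    mu[0, 0] = 1.0
    for i in range(1, num_classes):
        for j in range(i):
            # Solve <mu_i, mu_j> = -1/(L-1) one coordinate at a time.
            mu[i, j] = (-1.0 / (num_classes - 1) - mu[i, :j] @ mu[j, :j]) / mu[j, j]
        # The remaining mass keeps mu_i on the unit sphere.
        mu[i, i] = np.sqrt(max(0.0, 1.0 - mu[i, :i] @ mu[i, :i]))
    return radius * mu

def lda_logits(features, means):
    """LDA scores under equal priors and a shared identity covariance.

    Under those assumptions the class score reduces, up to a per-sample
    constant, to minus half the squared distance to each class mean.
    """
    d2 = ((features[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)
    return -0.5 * d2

# Illustrative usage: 10 classes, 128-dimensional latent features.
means = simplex_means(num_classes=10, dim=128)
z = np.random.randn(4, 128)  # stand-in for penultimate-layer features
predictions = lda_logits(z, means).argmax(axis=1)
```

In a full model, the feature extractor would produce `features` from its penultimate layer, and these logits would feed a cross-entropy loss in place of the standard softmax logits, encouraging the network to map each class onto its pre-assigned mean; the per-sample constant dropped in `lda_logits` cancels inside the softmax.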
