SARAF: Searching for Adversarial Robust Activation Functions

Convolutional Neural Networks (CNNs) have received great attention in the computer vision domain. However, CNNs are vulnerable to adversarial attacks: manipulations of the input that are imperceptible to humans yet fool the network. Existing defenses fall into two categories: (i) training the network on adversarial examples, and (ii) optimizing the network architecture and/or hyperparameters. Although adversarial training is an effective defense mechanism, it requires a large volume of training samples to cover a wide perturbation bound. Tweaking network activation functions (AFs) has been shown to provide promising results where CNNs suffer performance loss; however, optimizing network AFs to compensate for the negative impact of adversarial attacks has not been addressed in the literature. This paper proposes searching for AFs that are robust against adversarial attacks. To this end, we leverage the Simulated Annealing (SA) algorithm, which offers fast convergence; we call the proposed method SARAF. We demonstrate the consistent effectiveness of SARAF, achieving accuracy improvements of up to 16.92%, 18.3%, and 15.57% against the BIM, FGSM, and PGD adversarial attacks, respectively, over a ResNet-18 baseline with ReLU AFs trained on CIFAR-10. Moreover, SARAF offers significantly better search efficiency than random search as the optimization baseline.
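The core idea of an SA-based AF search can be sketched as follows. This is a minimal illustration, not the paper's implementation: the candidate set, cooling schedule, and especially the `fitness` function (here a fixed table of toy scores) are hypothetical stand-ins. In SARAF, fitness would be the validation accuracy of the trained network under an adversarial attack such as PGD.

```python
import math
import random

# Illustrative candidate activation functions (the actual SARAF
# search space is defined in the paper, not reproduced here).
CANDIDATES = ["relu", "leaky_relu", "elu", "gelu", "swish", "tanh"]

# Stand-in fitness table: in practice this would require training the
# network with the candidate AF and measuring robust accuracy.
TOY_ROBUST_ACC = {
    "relu": 0.48, "leaky_relu": 0.51, "elu": 0.55,
    "gelu": 0.57, "swish": 0.60, "tanh": 0.45,
}

def fitness(af):
    return TOY_ROBUST_ACC[af]

def simulated_annealing(candidates, fitness, t0=1.0, cooling=0.9,
                        steps=200, seed=0):
    """Maximize `fitness` over a discrete candidate set with SA."""
    rng = random.Random(seed)
    current = rng.choice(candidates)
    best = current
    t = t0
    for _ in range(steps):
        neighbor = rng.choice(candidates)  # random neighbor move
        delta = fitness(neighbor) - fitness(current)
        # Accept improvements always; accept worse moves with
        # probability exp(delta / t) (Metropolis criterion).
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current = neighbor
        if fitness(current) > fitness(best):
            best = current
        t *= cooling  # geometric cooling schedule
    return best

print(simulated_annealing(CANDIDATES, fitness))
```

The Metropolis acceptance rule lets the search escape locally good but globally suboptimal AFs early on, while the cooling schedule makes it increasingly greedy, which is what gives SA its fast convergence relative to exhaustive or purely random search.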