NeuroSCA: Evolving Activation Functions for Side-Channel Analysis

The choice of activation function can have a significant effect on the performance of a neural network. Although researchers continue to propose novel activation functions, the Rectified Linear Unit (ReLU) remains the most common choice in practice. This paper shows that evolutionary algorithms can discover new activation functions for side-channel analysis (SCA) that outperform ReLU. We use Genetic Programming (GP) to define and explore candidate activation functions, an approach known as neuroevolution. To the best of our knowledge, this is the first attempt to develop custom activation functions for SCA. Experiments on the ASCAD database show that this approach is highly effective compared to state-of-the-art neural network architectures. While the best performance is achieved when activation functions are evolved for a particular task, we also observe that the evolved functions generalize, maintaining high performance across different SCA scenarios.
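The GP search described above can be sketched in miniature. The snippet below is a hedged, self-contained illustration, not the paper's implementation: it evolves activation-function expression trees (built from a small function set such as `tanh`, `add`, `mul`, `max`) with a simple mutation-only evolutionary loop. The fitness used here is a toy surrogate (squared distance to ReLU on a grid); in the actual approach, each candidate function would instead be plugged into an SCA network and scored on attack metrics such as guessing entropy.

```python
import math
import random

random.seed(0)

# Function and terminal sets for candidate activation expressions.
UNARY = {"tanh": math.tanh, "neg": lambda v: -v}
BINARY = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b, "max": max}
TERMINALS = ["x", 0.5, 1.0]

def random_tree(depth=2):
    """Grow a random expression tree (nested tuples) over the input x."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(TERMINALS)
    if random.random() < 0.5:
        op = random.choice(list(UNARY))
        return (op, random_tree(depth - 1))
    op = random.choice(list(BINARY))
    return (op, random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, x):
    """Recursively evaluate an expression tree at input x."""
    if tree == "x":
        return x
    if isinstance(tree, float):
        return tree
    op = tree[0]
    if op in UNARY:
        return UNARY[op](evaluate(tree[1], x))
    return BINARY[op](evaluate(tree[1], x), evaluate(tree[2], x))

def fitness(tree):
    """Toy surrogate fitness: squared distance to ReLU on a grid.
    A real SCA fitness would train a network with this activation
    and measure, e.g., guessing entropy on attack traces."""
    xs = [i / 10.0 - 2.0 for i in range(41)]
    try:
        return sum((evaluate(tree, x) - max(0.0, x)) ** 2 for x in xs)
    except (OverflowError, ValueError):
        return float("inf")

def mutate(tree):
    """Replace a randomly chosen subtree with a fresh random tree."""
    if not isinstance(tree, tuple) or random.random() < 0.3:
        return random_tree(2)
    i = random.randrange(1, len(tree))
    return tree[:i] + (mutate(tree[i]),) + tree[i + 1:]

# Simple truncation-selection evolutionary loop.
pop = [random_tree(3) for _ in range(40)]
for gen in range(30):
    pop.sort(key=fitness)
    parents = pop[:10]           # keep the 10 fittest candidates
    pop = parents + [mutate(random.choice(parents)) for _ in range(30)]

best = min(pop, key=fitness)
```

In practice, a GP framework such as DEAP (cited by the paper) would supply the tree representation, crossover, and selection operators that this sketch hand-rolls.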