InfoNEAT: Information Theory-based NeuroEvolution of Augmenting Topologies for Side-channel Analysis

Profiled side-channel analysis (SCA) leverages leakage from cryptographic implementations to extract the secret key. When combined with advanced neural network (NN) methods, profiled SCA can successfully attack even crypto-cores assumed to be protected against SCA. Despite the growing number of studies devoted to NN-based SCA, existing methods have not systematically addressed the challenges it involves. Several questions remain unanswered in the context of SCA: how to choose an NN of adequate size, how to tune the NN's hyperparameters, when to stop the training, and how to explain the performance of the NN model in quantitative terms. Our proposed approach, "InfoNEAT," tackles these issues in a natural way. InfoNEAT relies on neuroevolution, that is, the evolution of both the network architecture and its parameters, and enhances it with information-theoretic metrics that guide the evolution, halt it with a novel stopping criterion, and improve time complexity and memory footprint. The performance of InfoNEAT is evaluated on publicly available datasets composed of real side-channel measurements. In addition to the considerable advantages of automated NN configuration, InfoNEAT demonstrates significant improvements over other approaches, including a reduction in the number of epochs and in the width of the NN (i.e., the number of nodes in a layer) by factors of at least 1.25 and 6.66, respectively. According to our results, this is achieved without any deterioration in SCA performance compared to state-of-the-art NN-based methods.
