Pay Attention to Raw Traces: A Deep Learning Architecture for End-to-End Profiling Attacks

With the renaissance of deep learning, the side-channel community also notices the potential of this technology, which is highly related to the profiling attacks in the side-channel context. Many papers have recently investigated the abilities of deep learning in profiling traces. Some of them also aim at the countermeasures (e.g., masking) simultaneously. Nevertheless, so far, all of these papers work with an (implicit) assumption that the number of time samples in raw traces can be reduced before the profiling, i.e., the position of points of interest (PoIs) can be manually located. This is arguably the most challenging part of a practical black-box analysis targeting an implementation protected by masking. Therefore, we argue that to fully utilize the potential of deep learning and get rid of any manual intervention, the end-to-end profiling directly mapping raw traces to target intermediate values is demanded.In this paper, we propose a neural network architecture that consists of encoders, attention mechanisms and a classifier, to conduct the end-to-end profiling. The networks built by our architecture could directly classify the traces that contain a large number of time samples (i.e., raw traces without manual feature extraction) while whose underlying implementation is protected by masking. We validate our networks on several public datasets, i.e., DPA contest v4 and ASCAD, where over 100,000 time samples are directly used in profiling. To our best knowledge, we are the first that successfully carry out end-to-end profiling attacks. The results on the datasets indicate that our networks could get rid of the tricky manual feature extraction. Moreover, our networks perform even systematically better (w.r.t. the number of traces in attacks) than those trained on the reduced traces. These validations imply our approach is not only a first but also a concrete step towards end-to-end profiling attacks in the side-channel context.

[1]  Joos Vandewalle,et al.  Machine learning in side-channel analysis: a first study , 2011, Journal of Cryptographic Engineering.

[2]  Emmanuel Prouff,et al.  Breaking Cryptographic Implementations Using Deep Learning Techniques , 2016, SPACE.

[3]  Liwei Zhang,et al.  A Statistical Model for Higher Order DPA on Masked Devices , 2014, IACR Cryptol. ePrint Arch..

[4]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[5]  Petr Dzurenda,et al.  Profiling power analysis attack based on MLP in DPA contest V4.2 , 2016, 2016 39th International Conference on Telecommunications and Signal Processing (TSP).

[6]  François-Xavier Standaert,et al.  Using Subspace-Based Template Attacks to Compare and Combine Power and Electromagnetic Information Leakages , 2008, CHES.

[7]  Nenghai Yu,et al.  An Enhanced Convolutional Neural Network in Side-Channel Attacks and Its Visualization , 2020, ArXiv.

[8]  Yoshua Bengio,et al.  Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.

[9]  Annelie Heuser,et al.  Intelligent Machine Homicide - Breaking Cryptographic Devices Using Support Vector Machines , 2012, COSADE.

[10]  Pankaj Rohatgi,et al.  Template Attacks , 2002, CHES.

[11]  Emmanuel Prouff,et al.  A Generic Method for Secure SBox Implementation , 2007, WISA.

[12]  Cécile Canovas,et al.  Convolutional Neural Networks with Data Augmentation Against Jitter-Based Countermeasures - Profiling Attacks Without Pre-processing , 2017, CHES.

[13]  Yoshua Bengio,et al.  Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.

[14]  Jürgen Schmidhuber,et al.  Recurrent nets that time and count , 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium.

[15]  Sylvain Guilley,et al.  RSM: A small and fast countermeasure for AES, secure against 1st and 2nd-order zero-offset SCAs , 2012, 2012 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[16]  Cécile Canovas,et al.  Study of Deep Learning Techniques for Side-Channel Analysis and Introduction to ASCAD Database , 2018, IACR Cryptol. ePrint Arch..

[17]  Tara N. Sainath,et al.  Locally-connected and convolutional neural networks for small footprint speaker recognition , 2015, INTERSPEECH.

[18]  Paul C. Kocher,et al.  Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS, and Other Systems , 1996, CRYPTO.

[19]  Lei Zhang,et al.  Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[20]  H. Jaeger A tutorial on training recurrent neural networks, covering BPTT, RTRL, EKF and the "echo state network" approach , 2021 .

[21]  Thomas S. Messerges,et al.  Using Second-Order Power Analysis to Attack DPA Resistant Software , 2000, CHES.

[22]  Romain Poussier,et al.  Template Attacks vs. Machine Learning Revisited (and the Curse of Dimensionality in Side-Channel Analysis) , 2015, COSADE.

[23]  Lukasz Kaiser,et al.  Reformer: The Efficient Transformer , 2020, ICLR.

[24]  Kerstin Lemke-Rust,et al.  Efficient Template Attacks Based on Probabilistic Multi-class Support Vector Machines , 2012, CARDIS.

[25]  Yoshua Bengio,et al.  Attention-Based Models for Speech Recognition , 2015, NIPS.

[26]  Andrew J. Davison,et al.  End-To-End Multi-Task Learning With Attention , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Colin Raffel,et al.  Online and Linear-Time Attention by Enforcing Monotonic Alignments , 2017, ICML.

[28]  Phil Blunsom,et al.  Teaching Machines to Read and Comprehend , 2015, NIPS.

[29]  François Durvaux,et al.  From Improved Leakage Detection to the Detection of Points of Interests in Leakage Traces , 2016, EUROCRYPT.

[30]  Ming Yang,et al.  DeepFace: Closing the Gap to Human-Level Performance in Face Verification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[31]  Siva Sai Yerubandi,et al.  Differential Power Analysis , 2002 .

[32]  Alan Hanjalic,et al.  Make Some Noise: Unleashing the Power of Convolutional Neural Networks for Profiled Side-channel Analysis , 2019, IACR Cryptol. ePrint Arch..

[33]  Yves Deville,et al.  Efficient Selection of Time Samples for Higher-Order DPA with Projection Pursuits , 2015, COSADE.

[34]  Christof Paar,et al.  Gaussian Mixture Models for Higher-Order Side Channel Analysis , 2007, CHES.

[35]  Christophe Giraud,et al.  An Implementation of DES and AES, Secure against Some Attacks , 2001, CHES.

[36]  Olivier Markowitch,et al.  Power analysis attack: an approach based on machine learning , 2014, Int. J. Appl. Cryptogr..

[37]  Yoshua Bengio,et al.  Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.

[38]  Annelie Heuser,et al.  The Curse of Class Imbalance and Conflicting Metrics with Machine Learning for Side-channel Evaluations , 2018, IACR Cryptol. ePrint Arch..

[39]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[40]  Sylvain Guilley,et al.  Analysis and Improvements of the DPA Contest v4 Implementation , 2014, SPACE.

[41]  Mohamed Chtourou,et al.  On the training of recurrent neural networks , 2011, Eighth International Multi-Conference on Systems, Signals & Devices.

[42]  Christopher D. Manning,et al.  Get To The Point: Summarization with Pointer-Generator Networks , 2017, ACL.

[43]  Lilian Bossuet,et al.  Methodology for Efficient CNN Architectures in Profiling Attacks , 2019, IACR Cryptol. ePrint Arch..

[44]  Werner Schindler,et al.  Advanced stochastic methods in side channel analysis on block ciphers in the presence of masking , 2008, J. Math. Cryptol..

[45]  Cécile Canovas,et al.  A Comprehensive Study of Deep Learning for Side-Channel Analysis , 2019, IACR Cryptol. ePrint Arch..

[46]  Bart Preneel,et al.  Revisiting a Methodology for Efficient CNN Architectures in Profiling Attacks , 2020, IACR Trans. Cryptogr. Hardw. Embed. Syst..