The Curse of Class Imbalance and Conflicting Metrics with Machine Learning for Side-channel Evaluations

We concentrate on machine learning techniques used for profiled side-channel analysis in the presence of imbalanced data. Such scenarios are realistic and often occurring, for instance in the Hamming weight or Hamming distance leakage models. In order to deal with the imbalanced data, we use various balancing techniques and we show that most of them help in mounting successful attacks when the data is highly imbalanced. Especially, the results with the SMOTE technique are encouraging, since we observe some scenarios where it reduces the number of necessary measurements more than 8 times. Next, we provide extensive results on comparison of machine learning and side-channel metrics, where we show that machine learning metrics (and especially accuracy as the most often used one) can be extremely deceptive. This finding opens a need to revisit the previous works and their results in order to properly assess the performance of machine learning in side-channel analysis.

[1]  Eric Peeters,et al.  On the masking countermeasure and higher-order power analysis attacks , 2005, International Conference on Information Technology: Coding and Computing (ITCC'05) - Volume II.

[2]  Sabri Boughorbel,et al.  Optimal classifier for imbalanced data using Matthews Correlation Coefficient metric , 2017, PloS one.

[3]  François Durvaux,et al.  From Improved Leakage Detection to the Detection of Points of Interests in Leakage Traces , 2016, EUROCRYPT.

[4]  Jacob Cohen A Coefficient of Agreement for Nominal Scales , 1960 .

[5]  Fernando De la Torre,et al.  Facing Imbalanced Data--Recommendations for the Use of Performance Metrics , 2013, 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction.

[6]  Sylvain Guilley,et al.  Template attack versus Bayes classifier , 2017, Journal of Cryptographic Engineering.

[7]  Christophe Clavier,et al.  Correlation Power Analysis with a Leakage Model , 2004, CHES.

[8]  Gustavo E. A. P. A. Batista,et al.  A study of the behavior of several methods for balancing machine learning training data , 2004, SKDD.

[9]  Siva Sai Yerubandi,et al.  Differential Power Analysis , 2002 .

[10]  Trevor Hastie,et al.  An Introduction to Statistical Learning , 2013, Springer Texts in Statistics.

[11]  Axel Legay,et al.  On the Performance of Convolutional Neural Networks for Side-Channel Analysis , 2018, SPACE.

[12]  Annelie Heuser,et al.  Intelligent Machine Homicide - Breaking Cryptographic Devices Using Support Vector Machines , 2012, COSADE.

[13]  Pankaj Rohatgi,et al.  Template Attacks , 2002, CHES.

[14]  Chih-Jen Lin,et al.  Working Set Selection Using Second Order Information for Training Support Vector Machines , 2005, J. Mach. Learn. Res..

[15]  Stephen Kwek,et al.  Applying Support Vector Machines to Imbalanced Datasets , 2004, ECML.

[16]  P. Rohatgi,et al.  Test Vector Leakage Assessment ( TVLA ) methodology in practice , 2013 .

[17]  Seetha Hari,et al.  Learning From Imbalanced Data , 2019, Advances in Computer and Electrical Engineering.

[18]  Sylvain Guilley,et al.  Side-channel analysis and machine learning: A practical perspective , 2017, 2017 International Joint Conference on Neural Networks (IJCNN).

[19]  Lejla Batina,et al.  Mutual Information Analysis: a Comprehensive Study , 2011, Journal of Cryptology.

[20]  Stefan Mangard,et al.  Power analysis attacks - revealing the secrets of smart cards , 2007 .

[21]  Jean-Sébastien Coron,et al.  An Efficient Method for Random Delay Generation in Embedded Software , 2009, CHES.

[22]  C. Lee Giles,et al.  Active learning for class imbalance problem , 2007, SIGIR.

[23]  Nitesh V. Chawla,et al.  SMOTE: Synthetic Minority Over-sampling Technique , 2002, J. Artif. Intell. Res..

[24]  Senén Barro,et al.  Do we need hundreds of classifiers to solve real world classification problems? , 2014, J. Mach. Learn. Res..

[25]  Markus G. Kuhn,et al.  Efficient Template Attacks , 2013, CARDIS.

[26]  Dennis L. Wilson,et al.  Asymptotic Properties of Nearest Neighbor Rules Using Edited Data , 1972, IEEE Trans. Syst. Man Cybern..

[27]  Emmanuel Prouff,et al.  Breaking Cryptographic Implementations Using Deep Learning Techniques , 2016, SPACE.

[28]  Gaël Varoquaux,et al.  Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..

[29]  Amir Moradi,et al.  Dual-rail transition logic: A logic style for counteracting power analysis attacks , 2009, Comput. Electr. Eng..

[30]  Sylvain Guilley,et al.  Lightweight Ciphers and Their Side-Channel Resilience , 2020, IEEE Transactions on Computers.

[31]  Romain Poussier,et al.  Template Attacks vs. Machine Learning Revisited (and the Curse of Dimensionality in Side-Channel Analysis) , 2015, COSADE.

[32]  Sylvain Guilley,et al.  Countering early evaluation: an approach towards robust dual-rail precharge logic , 2010, WESS '10.

[33]  Taghi M. Khoshgoftaar,et al.  The Effect of Data Sampling When Using Random Forest on Imbalanced Bioinformatics Data , 2015, 2015 IEEE International Conference on Information Reuse and Integration.

[34]  Moti Yung,et al.  A Unified Framework for the Analysis of Side-Channel Key Recovery Attacks (extended version) , 2009, IACR Cryptol. ePrint Arch..

[35]  B. Matthews Comparison of the predicted and observed secondary structure of T4 phage lysozyme. , 1975, Biochimica et biophysica acta.

[36]  François-Xavier Standaert,et al.  Univariate side channel attacks and leakage modeling , 2011, Journal of Cryptographic Engineering.

[37]  Luca Antiga,et al.  Automatic differentiation in PyTorch , 2017 .

[38]  Máire O'Neill,et al.  Neural network based attack on a masked implementation of AES , 2015, 2015 IEEE International Symposium on Hardware Oriented Security and Trust (HOST).

[39]  Cécile Canovas,et al.  Convolutional Neural Networks with Data Augmentation Against Jitter-Based Countermeasures - Profiling Attacks Without Pre-processing , 2017, CHES.

[40]  Leo Breiman,et al.  Random Forests , 2001, Machine Learning.

[41]  Yoshua Bengio,et al.  Convolutional networks for images, speech, and time series , 1998 .

[42]  Joos Vandewalle,et al.  Machine learning in side-channel analysis: a first study , 2011, Journal of Cryptographic Engineering.

[43]  Christof Paar,et al.  A Stochastic Model for Differential Side Channel Cryptanalysis , 2005, CHES.

[44]  Bartosz Krawczyk,et al.  Learning from imbalanced data: open challenges and future directions , 2016, Progress in Artificial Intelligence.

[45]  Rushi Longadge,et al.  Class Imbalance Problem in Data Mining Review , 2013, ArXiv.

[46]  Olivier Markowitch,et al.  A machine learning approach against a masked AES , 2014, Journal of Cryptographic Engineering.