The bias–variance decomposition in profiled attacks

The profiled attacks challenge the security of cryptographic devices in the worst case scenario. We elucidate the reasons underlying the success of different profiled attacks (that depend essentially on the context) based on the well-known bias–variance tradeoff developed in the machine learning field. Note that our approach can easily be extended to non-profiled attacks. We show (1) how to decompose (in three additive components) the error rate of an attack based on the bias–variance decomposition, and (2) how to reduce the error rate of a model based on the bias–variance diagnostic. Intuitively, we show that different models having the same error rate require different strategies (according to the bias–variance decomposition) to reduce their errors. More precisely, the success rate of a strategy depends on several criteria such as its complexity, the leakage information and the number of points per trace. As a result, a suboptimal strategy in a specific context can lead the adversary to overestimate the security level of the cryptographic device. Our results also bring warnings related to the estimation of the success rate of a profiled attack that can lead the evaluator to underestimate the security level. In brief, certify that a chip leaks (or not) sensitive information represents a hard if not impossible task.

[1]  Pedro M. Domingos A Unifeid Bias-Variance Decomposition and its Applications , 2000, ICML.

[2]  Pankaj Rohatgi,et al.  Template Attacks , 2002, CHES.

[3]  Olivier Markowitch,et al.  A Machine Learning Approach Against a Masked AES , 2013, CARDIS.

[4]  Johan A. K. Suykens,et al.  Least Squares Support Vector Machines , 2002 .

[5]  Christof Paar,et al.  Templates vs. Stochastic Methods , 2006, CHES.

[6]  Corinna Cortes,et al.  Support-Vector Networks , 1995, Machine Learning.

[7]  Joos Vandewalle,et al.  Machine learning in side-channel analysis: a first study , 2011, Journal of Cryptographic Engineering.

[8]  Robert Tibshirani,et al.  The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition , 2001, Springer Series in Statistics.

[9]  Robert E. Schapire,et al.  The Boosting Approach to Machine Learning An Overview , 2003 .

[10]  Christof Paar,et al.  A Stochastic Model for Differential Side Channel Cryptanalysis , 2005, CHES.

[11]  Olivier Markowitch,et al.  Power analysis attack: an approach based on machine learning , 2014, Int. J. Appl. Cryptogr..

[12]  Pedro M. Domingos A Unified Bias-Variance Decomposition for Zero-One and Squared Loss , 2000, AAAI/IAAI.

[13]  Emmanuel Prouff,et al.  DPA Attacks and S-Boxes , 2005, FSE.

[14]  Annelie Heuser,et al.  Intelligent Machine Homicide - Breaking Cryptographic Devices Using Support Vector Machines , 2012, COSADE.

[15]  Elisabeth Oswald,et al.  Profiling DPA: Efficacy and Efficiency Trade-Offs , 2013, CHES.

[16]  Moti Yung,et al.  A Unified Framework for the Analysis of Side-Channel Key Recovery Attacks (extended version) , 2009, IACR Cryptol. ePrint Arch..

[17]  Leo Breiman,et al.  Bagging Predictors , 1996, Machine Learning.

[18]  Thomas G. Dietterich,et al.  Machine Learning Bias, Statistical Bias, and Statistical Variance of Decision Tree Algorithms , 2008 .

[19]  Willi Meier,et al.  Nonlinearity Criteria for Cryptographic Functions , 1990, EUROCRYPT.

[20]  Sylvain Guilley,et al.  A Theoretical Study of Kolmogorov-Smirnov Distinguishers: Side-Channel Analysis vs. Differential Cryptanalysis , 2014, IACR Cryptol. ePrint Arch..

[21]  Robert Tibshirani,et al.  Bias, Variance and Prediction Error for Classification Rules , 1996 .

[22]  Pedro M. Domingos A Unifeid Bias-Variance Decomposition and its Applications , 2000, ICML.

[23]  S. Weisberg Applied Linear Regression: Weisberg/Applied Linear Regression 3e , 2005 .

[24]  Eli Biham,et al.  Differential Cryptanalysis of the Data Encryption Standard , 1993, Springer New York.

[25]  Jerome H. Friedman,et al.  On Bias, Variance, 0/1—Loss, and the Curse-of-Dimensionality , 2004, Data Mining and Knowledge Discovery.

[26]  Emmanuel Prouff,et al.  Block Ciphers Implementations Provably Secure Against Second Order Side Channel Analysis , 2008, FSE.

[27]  Gareth M. James,et al.  Generalizations of the Bias/Variance Decomposition for Prediction Error , 1997 .

[28]  Takeshi Sugawara,et al.  Profiling attack using multivariate regression analysis , 2010, IEICE Electron. Express.

[29]  Tom Heskes,et al.  Bias/Variance Decompositions for Likelihood-Based Estimators , 1998, Neural Computation.

[30]  Mitsuru Matsui,et al.  Linear Cryptanalysis Method for DES Cipher , 1994, EUROCRYPT.

[31]  Ron Kohavi,et al.  Bias Plus Variance Decomposition for Zero-One Loss Functions , 1996, ICML.

[32]  Paul C. Kocher,et al.  Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS, and Other Systems , 1996, CRYPTO.

[33]  S. Weisberg Applied Linear Regression , 1981 .

[34]  Olivier Markowitch,et al.  A Time Series Approach for Profiling Attack , 2013, SPACE.

[35]  Paul C. Kocher,et al.  Differential Power Analysis , 1999, CRYPTO.

[36]  Leo Breiman,et al.  Randomizing Outputs to Increase Prediction Accuracy , 2000, Machine Learning.

[37]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.

[38]  François Durvaux,et al.  How to Certify the Leakage of a Chip? , 2014, IACR Cryptol. ePrint Arch..

[39]  Sylvain Guilley,et al.  Portability of templates , 2012, Journal of Cryptographic Engineering.

[40]  Stefan Mangard,et al.  Power analysis attacks - revealing the secrets of smart cards , 2007 .

[41]  Francis Olivier,et al.  Electromagnetic Analysis: Concrete Results , 2001, CHES.

[42]  François-Xavier Standaert,et al.  Univariate side channel attacks and leakage modeling , 2011, Journal of Cryptographic Engineering.

[43]  Kerstin Lemke-Rust,et al.  Efficient Template Attacks Based on Probabilistic Multi-class Support Vector Machines , 2012, CARDIS.

[44]  Andrew Y. Ng,et al.  Preventing "Overfitting" of Cross-Validation Data , 1997, ICML.

[45]  Michael A. Temple,et al.  Improving cross-device attacks using zero-mean unit-variance normalization , 2012, Journal of Cryptographic Engineering.

[46]  Olivier Markowitch,et al.  Side channel attack: an approach based on machine learning , 2011 .

[47]  Leo Breiman,et al.  Random Forests , 2001, Machine Learning.

[48]  L. Breiman Arcing Classifiers , 1998 .

[49]  Elie Bienenstock,et al.  Neural Networks and the Bias/Variance Dilemma , 1992, Neural Computation.

[50]  Liwei Zhang,et al.  A Statistics-based Fundamental Model for Side-channel Attack Analysis , 2014, IACR Cryptol. ePrint Arch..