Certifying Decision Trees Against Evasion Attacks by Program Analysis

Machine learning has proved invaluable for a range of different tasks, yet it has also proved vulnerable to evasion attacks, i.e., maliciously crafted perturbations of input data designed to force mispredictions. In this paper we propose a novel technique to verify the security of decision tree models against evasion attacks with respect to an expressive threat model, where the attacker can be represented by an arbitrary imperative program. Our approach exploits the interpretability of decision trees to transform them into imperative programs, which are amenable to traditional program analysis techniques. By leveraging the abstract interpretation framework, we soundly verify the security guarantees of decision tree models trained over publicly available datasets. Our experiments show that our technique is both precise and efficient, yielding only a minimal number of false positives and scaling up to cases that are intractable for a competing approach.
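To make the core idea concrete, below is a minimal Python sketch, not the paper's implementation: a decision tree is viewed as a nested if/else program, the attacker is modelled here as a simple L-infinity perturbation budget (the paper supports arbitrary imperative attacker programs), and the program is abstractly executed over the interval domain to collect every reachable leaf label. All names (Node, certify_interval, eps) are illustrative assumptions, not identifiers from the paper.

```python
# Sketch of certification by abstract interpretation over intervals.
# Assumptions: L-infinity attacker with radius eps; interval abstract domain.
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class Node:
    """Internal node tests x[feature] <= threshold; leaves carry a label."""
    feature: int = -1
    threshold: float = 0.0
    left: Optional["Node"] = None     # taken when x[feature] <= threshold
    right: Optional["Node"] = None    # taken when x[feature] >  threshold
    label: Optional[int] = None       # set only on leaves


def predict(node: Node, x: List[float]) -> int:
    """Concrete semantics: the tree as an imperative if/else program."""
    if node.label is not None:
        return node.label
    if x[node.feature] <= node.threshold:
        return predict(node.left, x)
    return predict(node.right, x)


def reachable_labels(node: Node, box: List[Tuple[float, float]]) -> Set[int]:
    """Abstract semantics: 'box' maps each feature to an interval [lo, hi] of
    attacker-reachable values. Both branches are explored whenever the
    interval straddles the threshold, refining the interval on each branch."""
    if node.label is not None:
        return {node.label}
    lo, hi = box[node.feature]
    labels: Set[int] = set()
    if lo <= node.threshold:          # left branch is feasible
        left_box = list(box)
        left_box[node.feature] = (lo, min(hi, node.threshold))
        labels |= reachable_labels(node.left, left_box)
    if hi > node.threshold:           # right branch is feasible
        right_box = list(box)
        right_box[node.feature] = (max(lo, node.threshold), hi)
        labels |= reachable_labels(node.right, right_box)
    return labels


def certify_interval(tree: Node, x: List[float], eps: float) -> bool:
    """Soundly certify that no L-infinity perturbation of radius eps changes
    the prediction on x: only the original label is abstractly reachable."""
    box = [(xi - eps, xi + eps) for xi in x]
    return reachable_labels(tree, box) == {predict(tree, x)}


if __name__ == "__main__":
    # Tiny tree: if x[0] <= 0.5 then (if x[1] <= 0.3 then 0 else 1) else 1
    tree = Node(feature=0, threshold=0.5,
                left=Node(feature=1, threshold=0.3,
                          left=Node(label=0), right=Node(label=1)),
                right=Node(label=1))
    print(certify_interval(tree, [0.2, 0.10], eps=0.05))  # True: certified robust
    print(certify_interval(tree, [0.2, 0.29], eps=0.05))  # False: label can flip
```

Because the abstract execution over-approximates the attacker's reachable states, a positive answer is a sound robustness guarantee, while a negative answer may be a false positive; this mirrors the precision/soundness trade-off discussed in the abstract, where richer abstract domains reduce false positives at additional cost.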
