cleverhans is a software library that provides standardized reference implementations of adversarial example construction techniques and adversarial training. The library may be used to develop more robust machine learning models and to provide standardized benchmarks of models’ performance in the adversarial setting. Benchmarks constructed without a standardized implementation of adversarial example construction are not comparable to each other, because a good result may indicate a robust model or it may merely indicate a weak implementation of the adversarial example construction procedure. This technical report is structured as follows. Section 1 provides an overview of adversarial examples in machine learning and of the cleverhans software. Section 2 presents the core functionality of the library, namely attacks based on adversarial examples and defenses that improve the robustness of machine learning models to these attacks. Section 3 describes how to report benchmark results using the library. Section 4 describes the versioning system.
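The comparability argument can be illustrated with a minimal sketch. It does not use the cleverhans API. It assumes a hypothetical two-feature linear classifier with fixed weights and compares two attacks under the same perturbation budget `eps`: a gradient-sign attack in the style of the fast gradient sign method, and a deliberately weak stand-in that picks perturbation signs at random. All names (`gradient_sign_attack`, `random_sign_attack`, `accuracy`) and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def accuracy(x, y, w, b):
    """Fraction of inputs whose predicted label (logit > 0) matches y."""
    preds = (x @ w + b > 0).astype(int)
    return float(np.mean(preds == y))

def gradient_sign_attack(x, y, w, b, eps):
    """Perturb x by eps times the sign of the cross-entropy gradient w.r.t. x."""
    probs = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    grad_x = (probs - y)[:, None] * w[None, :]  # gradient for a logistic model
    return x + eps * np.sign(grad_x)

def random_sign_attack(x, eps, rng):
    """Weak baseline (illustrative): same budget, but random perturbation signs."""
    return x + eps * rng.choice([-1.0, 1.0], size=x.shape)

rng = np.random.default_rng(0)
w, b = np.array([1.0, -1.0]), 0.0        # hypothetical fixed model
x = rng.normal(size=(200, 2))            # synthetic inputs
y = (x @ w + b > 0).astype(int)          # labels the model classifies correctly

eps = 0.5
clean_acc = accuracy(x, y, w, b)
strong_acc = accuracy(gradient_sign_attack(x, y, w, b, eps), y, w, b)
weak_acc = accuracy(random_sign_attack(x, eps, rng), y, w, b)

print(f"clean={clean_acc:.2f} weak_attack={weak_acc:.2f} strong_attack={strong_acc:.2f}")
```

The model is identical in both evaluations, yet the accuracy reported under the weak attack is higher. Reported in isolation, that number could be mistaken for robustness, which is why a shared reference implementation of the attack matters for benchmarks.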