An Empirical Study of Accuracy, Fairness, Explainability, Distributional Robustness, and Adversarial Robustness

It is increasingly apparent that, to ensure trust in AI models, evaluation must extend beyond traditional performance metrics such as accuracy to other dimensions: fairness, explainability, adversarial robustness, and robustness to distribution shift. We describe an empirical study that evaluates multiple model types on several datasets, using various metrics along each of these dimensions. Our results show that no single model type performs well on all dimensions, and they illustrate the trade-offs involved in selecting among models evaluated along multiple dimensions.
