Machine Learning Techniques for Accountability

Our goal, in this short overview article, is to begin mapping the landscape of methods for accountability of artificial intelligence (AI) systems. For our purposes, we define accountability as being able to ascertain whether an AI system is behaving as promised, which is necessary for determining blameworthiness. In the context of a self-driving car, AI system accountability could be a question of safety; in the context of credit scoring, it could be a question of fairness; in an algorithmic trading system, it could be a question of performance and robustness to certain shocks. In this overview, we do not focus on any particular objective (such as safety, fairness, or robustness); we believe that defining and refining these objectives for each context is a moral decision that must be made by the public and their representatives, not technologists. Rather, our goal is to begin the process of mapping the categories of methods that one could use to assess whether an AI system is meeting its objectives.

Artificial intelligence systems have provided us with many everyday conveniences. We can easily search for information across millions of webpages via text and voice. Paperwork processing is increasingly automated. Artificial intelligence systems flag potentially fraudulent credit-card transactions and filter our e-mail. Yet these systems have also experienced significant failings. Across a range of applications, including loan approvals, disease severity scores, hiring algorithms, and face recognition, artificial-intelligence-based scoring systems have exhibited gender and racial bias. Self-driving cars have had serious accidents. As these systems become more prevalent, it is increasingly important that we identify the best ways to keep them accountable.
