StrategyAtlas: Strategy Analysis for Machine Learning Interpretability

Businesses in high-risk environments have been reluctant to adopt modern machine learning approaches due to their complex, uninterpretable nature. Most current solutions provide local, instance-level explanations, but these are insufficient for understanding a model as a whole. In this work, we show that strategy clusters (i.e., groups of data instances that are treated distinctly by the model) can be used to understand the global behavior of a complex ML model. To support effective exploration and understanding of these clusters, we introduce StrategyAtlas, a system designed to analyze and explain model strategies. The system also supports multiple ways to use these strategies to simplify and improve the reference model. In collaboration with a large insurance company, we present a use case in automatic insurance acceptance and show how the system enabled professional data scientists to understand a complex model and to improve the production model based on these insights.
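The core idea, grouping instances that the model treats in the same way, can be illustrated with a minimal sketch. The exact pipeline used by StrategyAtlas is not detailed here; the sketch below assumes each instance has a per-feature contribution vector (as produced by an explainer such as SHAP) and groups instances with similar contribution profiles using a small hand-rolled k-means. The synthetic data, the two "strategies", and the helper `kmeans` are illustrative assumptions, not the authors' implementation.

```python
import random

random.seed(0)

# Toy per-instance feature-contribution vectors (stand-ins for e.g. SHAP
# values). Two hypothetical model strategies: one driven mainly by
# feature 0, the other mainly by feature 1.
contributions = (
    [[random.gauss(2.0, 0.3), random.gauss(0.0, 0.3)] for _ in range(20)]
    + [[random.gauss(0.0, 0.3), random.gauss(2.0, 0.3)] for _ in range(20)]
)

def kmeans(points, k, iters=20):
    """Minimal k-means over contribution vectors; returns one label per point."""
    # Deterministic init: centers spread evenly over the point list.
    centers = [list(points[(i * (len(points) - 1)) // max(k - 1, 1)])
               for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Move each center to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Each resulting cluster is a candidate "strategy": a group of instances
# whose predictions are driven by the same features in the same way.
labels = kmeans(contributions, k=2)
```

Instances that end up in the same cluster share an explanation profile, so inspecting one cluster at a time gives a global, strategy-level view of the model rather than forty separate local explanations.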
