Algorithms for interpretable machine learning

An analysis of algorithms for improving the interpretability of classification models, and a proposal of a novel method for explaining individual predictions
