White-box Induction From SVM Models: Explainable AI with Logic Programming

Abstract: We focus on the problem of inducing logic programs that explain models learned by the support vector machine (SVM) algorithm. Top-down sequential covering inductive logic programming (ILP) algorithms (e.g., FOIL) apply hill-climbing search guided by heuristics from information theory. A major weakness of this class of algorithms is that the search can get stuck in local optima. Our approach replaces the data-dependent hill-climbing search with a model-dependent search. A globally optimal SVM model is trained first. The algorithm then treats the support vectors as the most influential data points in the model and, for each one, induces a clause that covers the support vector and the points most similar to it. Instead of defining a fixed hypothesis search space, our algorithm uses SHAP, an example-specific interpreter from explainable AI, to determine the relevant set of features. The resulting algorithm captures the SVM model's underlying logic and outperforms other ILP algorithms in terms of the number of induced clauses and classification evaluation metrics.
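The idea of building a clause around a single support vector from its most relevant features can be illustrated with a minimal sketch. This is not the paper's algorithm. It assumes a linear decision function with hand-picked weights standing in for a trained SVM, binary features, and the closed-form SHAP values of a linear model under feature independence, phi_i = w_i * (x_i - E[x_i]). The feature names, weights, data, and the helper names `linear_shap`, `induce_clause`, `covers`, and `to_rule` are all hypothetical.

```python
def linear_shap(weights, x, background):
    """SHAP values of a linear model, assuming independent features:
    phi_i = w_i * (x_i - mean of feature i over the background data)."""
    means = [sum(col) / len(col) for col in zip(*background)]
    return [w * (xi - m) for w, xi, m in zip(weights, x, means)]


def induce_clause(weights, x, background, names, k=2):
    """Keep up to k features of x with the largest positive SHAP value
    and turn them into (feature name, required value) literals."""
    phi = linear_shap(weights, x, background)
    ranked = sorted(range(len(x)), key=lambda i: phi[i], reverse=True)[:k]
    chosen = sorted(i for i in ranked if phi[i] > 0)
    return [(names[i], x[i]) for i in chosen]


def covers(clause, example, names):
    """True if the example agrees with every literal in the clause."""
    index = {name: i for i, name in enumerate(names)}
    return all(example[index[name]] == value for name, value in clause)


def to_rule(clause, head="positive(X)"):
    """Render the clause as a logic-programming rule; a required value
    of 0 becomes a negated literal."""
    body = ", ".join(
        f"{name}(X)" if value == 1 else f"not {name}(X)"
        for name, value in clause
    )
    return f"{head} :- {body}."


# Hypothetical binary data set and weights (assumptions, not from the paper).
names = ["a", "b", "c"]
background = [[1, 1, 0], [1, 0, 1], [0, 1, 0], [0, 0, 1]]
weights = [2.0, 1.0, -0.5]
support_vector = [1, 1, 0]  # assumed to be a support vector of the model

phi = linear_shap(weights, support_vector, background)
clause = induce_clause(weights, support_vector, background, names, k=2)
rule = to_rule(clause)
covered = [row for row in background if covers(clause, row, names)]

print(phi)      # [1.0, 0.5, 0.25]
print(rule)     # positive(X) :- a(X), b(X).
print(covered)  # [[1, 1, 0]]
```

With `k=3` the same support vector yields `positive(X) :- a(X), b(X), not c(X).`, because the absence of a negatively weighted feature also pushes the score toward the positive class. A full sequential covering loop would remove the covered examples and repeat with the next support vector; that step is omitted here.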
