Extracting Boolean and probabilistic rules from trained neural networks

This paper presents two approaches to extracting rules from a trained neural network composed of linear threshold functions. The first approach yields an algorithm that extracts rules in the form of Boolean functions; compared with an existing algorithm, it outputs much more concise rules when the threshold functions correspond to 1-decision lists, majority functions, or certain combinations of these. The second approach uses a dynamic programming algorithm to extract probabilistic rules that represent relations between a subset of the input variables and the output; the algorithm runs in pseudo-polynomial time when each hidden layer contains a constant number of neurons. We demonstrate the effectiveness of both approaches through computational experiments.
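To make the first claim concrete, the following is a minimal sketch, not the paper's extraction algorithm, of the correspondence alluded to above: a 1-decision list can be simulated by a single linear threshold function whose weights shrink geometrically, so that an earlier rule always dominates all later ones. The specific list, weights, and threshold below are hand-picked assumptions for illustration.

```python
from itertools import product

# A 1-decision list: test one literal at a time, in order; the first
# matching literal fixes the output, otherwise fall through to a default.
# Each rule is (variable index, required value, output).
decision_list = [(0, 1, 1),   # if x0 == 1: output 1
                 (1, 1, 0),   # elif x1 == 1: output 0
                 (2, 1, 1)]   # elif x2 == 1: output 1
default = 0

def eval_decision_list(x):
    for var, val, out in decision_list:
        if x[var] == val:
            return out
    return default

# Equivalent linear threshold function: give earlier rules geometrically
# larger weights (+2^k for output 1, -2^k for output 0) so the first
# matching rule dominates everything after it.
weights = [4, -2, 1]   # hand-chosen for the list above (an assumption)
threshold = 1

def eval_threshold(x):
    return 1 if sum(w * xi for w, xi in zip(weights, x)) >= threshold else 0

# Exhaustive check over all 2^3 Boolean inputs.
for x in product([0, 1], repeat=3):
    assert eval_decision_list(x) == eval_threshold(x)
print("decision list and threshold function agree on all inputs")
```

Because each weight's magnitude exceeds the sum of the magnitudes of all later weights, the sign contributed by the first firing literal decides the comparison with the threshold, which is exactly the decision-list semantics.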

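For the second approach, the sketch below illustrates why a dynamic program over partial weighted sums is pseudo-polynomial. It is a drastic simplification of the paper's multi-layer setting, not its algorithm: for a single threshold neuron with integer weights, it computes the probability that the output is 1 under uniformly random Boolean inputs, with some inputs optionally fixed, by counting assignments per achievable partial sum. The table size, and hence the running time, grows with the weight magnitudes rather than only the number of inputs. The function name `prob_output_one` and the example values are illustrative assumptions.

```python
from collections import defaultdict

def prob_output_one(weights, threshold, fixed=None):
    """P(w . x >= threshold) over uniform Boolean x, with some inputs
    optionally fixed via {index: value}. Knapsack-style DP over partial
    weighted sums; the number of table entries is bounded by the range of
    achievable sums, hence the pseudo-polynomial running time."""
    fixed = fixed or {}
    counts = defaultdict(int)   # partial sum -> number of assignments
    counts[0] = 1
    free = 0                    # number of unfixed input variables
    for i, w in enumerate(weights):
        choices = [fixed[i]] if i in fixed else [0, 1]
        if i not in fixed:
            free += 1
        nxt = defaultdict(int)
        for s, c in counts.items():
            for b in choices:
                nxt[s + w * b] += c
        counts = nxt
    hits = sum(c for s, c in counts.items() if s >= threshold)
    return hits / (2 ** free)

# Example: the neuron from the previous sketch. With x0 fixed to 0, the
# output is 1 only when x1 = 0 and x2 = 1, i.e. with probability 1/4.
print(prob_output_one([4, -2, 1], 1, fixed={0: 0}))  # prints 0.25
```

Fixing different subsets of inputs and reading off these probabilities is one simple way to phrase probabilistic rules of the form "given these input values, the output is 1 with probability p".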