Creating Powerful and Interpretable Models with Regression Networks

As the discipline has evolved, machine learning research has focused increasingly on building more powerful neural networks, with little regard for their interpretability. Such "black-box" models yield state-of-the-art results, but we cannot understand why they make a particular decision or prediction. Sometimes this is acceptable, but often it is not. We propose a novel architecture, Regression Networks, which combines the power of neural networks with the understandability of regression analysis. While methods for combining the two already exist in the literature, our architecture generalizes them by accounting for feature interactions, offering the power of a dense neural network without forsaking interpretability. We demonstrate that these models exceed the state-of-the-art performance of interpretable models on several benchmark datasets, matching the power of a dense neural network. Finally, we discuss how these techniques can be generalized to other neural architectures, such as convolutional and recurrent neural networks.
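To make the idea concrete, the following is a minimal PyTorch sketch of one plausible reading of the abstract: an additive model built from per-feature subnetworks plus pairwise-interaction subnetworks, so every prediction decomposes into attributable contributions. The class names, subnetwork sizes, and the choice of all pairwise interactions are illustrative assumptions, not the paper's actual architecture.

```python
# Hypothetical sketch (assumption): an additive combination of per-feature
# subnetworks and pairwise-interaction subnetworks, in the spirit of neural
# additive models extended with interaction terms.
import itertools
import torch
import torch.nn as nn


class FeatureNet(nn.Module):
    """Small MLP applied to a single feature or a pair of features."""

    def __init__(self, in_dim: int, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        return self.net(x)


class RegressionNetworkSketch(nn.Module):
    """Prediction = bias + sum of per-feature terms + sum of pairwise
    interaction terms; each term remains attributable to its inputs."""

    def __init__(self, n_features: int):
        super().__init__()
        self.bias = nn.Parameter(torch.zeros(1))
        self.mains = nn.ModuleList([FeatureNet(1) for _ in range(n_features)])
        self.pairs = list(itertools.combinations(range(n_features), 2))
        self.interactions = nn.ModuleList([FeatureNet(2) for _ in self.pairs])

    def forward(self, x):
        out = self.bias.expand(x.shape[0], 1)
        for j, net in enumerate(self.mains):
            out = out + net(x[:, j:j + 1])          # main effect of feature j
        for (i, j), net in zip(self.pairs, self.interactions):
            out = out + net(x[:, [i, j]])           # interaction of (i, j)
        return out.squeeze(-1)


# Usage: train with any standard regression loss (e.g. MSE) on tabular data.
model = RegressionNetworkSketch(n_features=8)
pred = model(torch.randn(16, 8))  # shape: (16,)
```

Because the output is a sum of low-dimensional terms, each main-effect or interaction subnetwork can be plotted or inspected on its own, which is what keeps such a model interpretable while retaining nonlinear capacity.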
