ACCURACY VERSUS INTERPRETABILITY IN FLEXIBLE MODELING: IMPLEMENTING A TRADEOFF USING GAUSSIAN PROCESS MODELS

One of the widely acknowledged drawbacks of flexible statistical models is that the fitted models are often extremely difficult to interpret. However, if flexible models are constrained to be additive, the fitted models are much easier to interpret, because each input can be considered independently. The problem with additive models is that they cannot model a phenomenon accurately if that phenomenon is not additive. This paper shows that a tradeoff between accuracy and additivity can be implemented easily in Gaussian process models, a type of flexible model closely related to feedforward neural networks. One can fit a series of Gaussian process models that begins with a completely flexible model and proceeds through models constrained to be progressively more additive, and thus progressively more interpretable. Observing how the degree of non-additivity and the test error change across this series gives insight into the importance of interactions for the problem at hand. Fitted models in the series can also be interpreted graphically with a technique, adapted from the plots used for generalized additive models, for visualizing the effects of inputs in non-additive models. This visualization shows the overall effect of each input and also reveals which inputs are involved in interactions and how strong those interactions are.
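To make the abstract's central claim concrete, here is a minimal sketch of how such a tradeoff can be expressed in a Gaussian process covariance function: a sum of one-dimensional squared-exponential terms (the additive part) plus a single product term over all inputs (the interaction part), with a weight that controls how non-additive the model is allowed to be. This is an illustration under those assumptions, not the paper's code; the function and parameter names (`additive_plus_interaction_cov`, `interaction_scale`, `gp_predict`) are hypothetical.

```python
# Sketch: a GP covariance that interpolates between an additive model
# (interaction_scale = 0, GAM-like, each input acts independently) and a
# fully flexible model (large interaction_scale, arbitrary interactions).
# Names and parameterization are illustrative, not taken from the paper.
import numpy as np

def additive_plus_interaction_cov(X1, X2, lengthscales, add_scales,
                                  interaction_scale):
    """k(x, x') = sum_i a_i^2 exp(-(x_i - x'_i)^2 / l_i^2)
                + s^2 prod_i exp(-(x_i - x'_i)^2 / l_i^2)"""
    n1, n2, d = X1.shape[0], X2.shape[0], X1.shape[1]
    K = np.zeros((n1, n2))
    interaction = np.ones((n1, n2))
    for i in range(d):
        d2 = (X1[:, i][:, None] - X2[:, i][None, :]) ** 2
        per_dim = np.exp(-d2 / lengthscales[i] ** 2)
        K += add_scales[i] ** 2 * per_dim  # additive one-dimensional terms
        interaction *= per_dim             # product builds the joint term
    return K + interaction_scale ** 2 * interaction

def gp_predict(X_train, y_train, X_test, ls, a, s, noise_var=0.1):
    """Posterior mean of GP regression under the covariance above."""
    K = additive_plus_interaction_cov(X_train, X_train, ls, a, s)
    K += noise_var * np.eye(len(X_train))      # add observation noise
    alpha = np.linalg.solve(K, y_train)
    K_star = additive_plus_interaction_cov(X_test, X_train, ls, a, s)
    return K_star @ alpha
```

Refitting this model for a decreasing sequence of `interaction_scale` values (or shrinking it via a prior) traces out the series the abstract describes: if the test error barely rises as the interaction term is suppressed, the phenomenon is close to additive and the simpler, more interpretable model suffices.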
