Implementation and evaluation of nonparametric regression procedures for sensitivity analysis of computationally demanding models

The analysis of many physical and engineering problems involves running complex computational models (simulation models, computer codes). For problems of this type, it is important to understand the relationship between the input variables, whose values are often imprecisely known, and the output. The goal of sensitivity analysis (SA) is to study this relationship and to identify the input variables that most strongly affect the results of the model. This presentation describes an improvement on existing methods for SA of complex computer models, intended for cases where the model is too computationally expensive for a standard Monte Carlo analysis. In these situations, a meta-model (surrogate model) can be used to estimate the necessary sensitivity index for each input. A sensitivity index measures the portion of the variance in the response that is due to the uncertainty in an input. Most existing approaches to this problem either perform poorly with a large number of input variables or ignore the error involved in estimating a sensitivity index. Here, a new approach to sensitivity index estimation using meta-models and bootstrap confidence intervals is described that addresses both drawbacks. Further, a computationally efficient way to incorporate this methodology into an actual SA is presented. Several simulated and real examples illustrate the utility of the approach. The framework can also be extended to uncertainty analysis.
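To make the idea of a meta-model based sensitivity index with a bootstrap confidence interval concrete, the following is a minimal sketch under stated assumptions, not the procedure evaluated in this work. The toy function `expensive_model`, the additive quadratic least-squares surrogate, the uniform input distributions, and the percentile bootstrap over design points are all illustrative choices introduced here; the nonparametric regression procedures named in the title would take the place of the simple surrogate.

```python
import numpy as np

def expensive_model(X):
    # Hypothetical stand-in for a costly simulation: x1 dominates,
    # x2 has a small nonlinear effect, x3 is inert.
    return 4.0 * X[:, 0] + X[:, 1] ** 2 + 0.5 * X[:, 0] * X[:, 1]

def fit_additive_quadratic(X, y):
    # Assumed surrogate: y ~ b0 + sum_j (b_j * x_j + c_j * x_j**2),
    # fitted by ordinary least squares.
    A = np.column_stack([np.ones(len(X)), X, X ** 2])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def first_order_indices(coef, X_mc):
    # S_j = Var(f_j(x_j)) / Var(f(x)), evaluated on a large, cheap
    # Monte Carlo sample pushed through the surrogate only.
    p = X_mc.shape[1]
    lin, quad = coef[1:p + 1], coef[p + 1:]
    main = X_mc * lin + X_mc ** 2 * quad   # column j holds f_j(x_j)
    total = coef[0] + main.sum(axis=1)     # surrogate prediction
    return main.var(axis=0) / total.var()

def bootstrap_ci(X, y, X_mc, n_boot=200, alpha=0.05, seed=0):
    # Percentile bootstrap: resample design points with replacement,
    # refit the surrogate, and recompute every index.
    rng = np.random.default_rng(seed)
    n = len(y)
    est = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        coef_b = fit_additive_quadratic(X[idx], y[idx])
        est[b] = first_order_indices(coef_b, X_mc)
    lo, hi = np.quantile(est, [alpha / 2, 1 - alpha / 2], axis=0)
    return lo, hi

rng = np.random.default_rng(1)
X = rng.uniform(size=(100, 3))       # small design: only 100 model runs
y = expensive_model(X)
X_mc = rng.uniform(size=(5000, 3))   # large sample for the surrogate

S = first_order_indices(fit_additive_quadratic(X, y), X_mc)
lo, hi = bootstrap_ci(X, y, X_mc)
for j in range(3):
    print(f"x{j + 1}: S = {S[j]:.3f}, 95% CI = [{lo[j]:.3f}, {hi[j]:.3f}]")
```

The expensive model is evaluated only at the 100 design points; all variance calculations use the fitted surrogate. The interval width reflects how much the estimated index changes when the surrogate is refitted to resampled designs, which is the estimation error that a point estimate alone would hide.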
