An Information-Theoretic Approach to Optimally Calibrate Approximate Models

With advances in modeling and numerical algorithms, decisions supported by modeling and simulation have become more mainstream than ever. Even though computational power continues to increase, in most engineering applications the problem of optimal design under uncertainty remains prohibitively expensive due to the long runtimes of individual simulations. The obvious remedy is to reduce the complexity of the model by introducing simplifying assumptions, thereby constructing an approximate model. Calibrating these simpler models requires a large number of runs of the complex model, which may still be too expensive and inefficient for the task at hand. In this paper, we study the problem of optimal data collection to efficiently learn the model parameters of an approximate model in the context of Bayesian analysis. The paper emphasizes the influence of model discrepancy on the calibration of the approximate model and hence on the choice of optimal designs. Model discrepancy is modeled in this study using a Gaussian process. The optimal design is obtained as the result of an information-theoretic sensitivity analysis: the preferred design is the one for which the statistical dependence between the model parameters and the observables is highest. The statistical dependence between random variables is quantified by mutual information and estimated using a k-nearest-neighbor based approximation. As a model problem, a convective-dispersion model is calibrated to approximate the physics of Burgers' equation over a limited time domain of interest.
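As a minimal illustration of the k-nearest-neighbor mutual-information estimate mentioned in the abstract, the sketch below implements the standard Kraskov-Stögbauer-Grassberger (KSG) estimator on paired samples of parameters and observables. This is an assumption-laden sketch, not the paper's code: the function name, the numpy/scipy tooling, and the choice of KSG algorithm 1 are illustrative.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma

def ksg_mutual_information(x, y, k=3):
    """KSG (algorithm 1) estimate of I(X; Y) in nats.

    x : (n, dx) array of paired samples of X (e.g., model parameters)
    y : (n, dy) array of paired samples of Y (e.g., observables)
    k : number of nearest neighbors in the joint space
    """
    x = x.reshape(len(x), -1)
    y = y.reshape(len(y), -1)
    n = len(x)
    xy = np.hstack([x, y])

    # Distance to the k-th nearest neighbor in the joint space
    # (max-norm); index 0 is the point itself at distance zero.
    joint_tree = cKDTree(xy)
    eps = joint_tree.query(xy, k=k + 1, p=np.inf)[0][:, -1]

    # Count marginal points strictly within eps of each sample,
    # excluding the sample itself.
    x_tree, y_tree = cKDTree(x), cKDTree(y)
    nx = np.array([len(x_tree.query_ball_point(xi, e - 1e-12, p=np.inf)) - 1
                   for xi, e in zip(x, eps)])
    ny = np.array([len(y_tree.query_ball_point(yi, e - 1e-12, p=np.inf)) - 1
                   for yi, e in zip(y, eps)])

    # Kraskov et al. (2004), Eq. (8):
    # I(X;Y) ~ psi(k) + psi(n) - < psi(nx+1) + psi(ny+1) >
    return digamma(k) + digamma(n) - np.mean(digamma(nx + 1) + digamma(ny + 1))

# Sanity check against a known value: for a bivariate Gaussian with
# correlation rho, I(X; Y) = -0.5 * log(1 - rho**2) = 0.5108 nats at rho = 0.8.
rng = np.random.default_rng(0)
rho = 0.8
x = rng.standard_normal((5000, 1))
y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal((5000, 1))
print(ksg_mutual_information(x, y, k=3))  # close to 0.51
```

In the design problem the abstract describes, such an estimator would be evaluated on prior-predictive samples of (parameters, observables) for each candidate design, and the design with the largest estimated mutual information would be preferred.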
