Sequential Bayesian optimal experimental design via approximate dynamic programming

The design of multiple experiments is commonly undertaken via suboptimal strategies, such as batch (open-loop) design that omits feedback or greedy (myopic) design that does not account for future effects. This paper introduces new strategies for the optimal design of sequential experiments. First, we rigorously formulate the general sequential optimal experimental design (sOED) problem as a dynamic program. Batch and greedy designs are shown to result from special cases of this formulation. We then focus on sOED for parameter inference, adopting a Bayesian formulation with an information-theoretic design objective. To make the problem tractable, we develop new numerical approaches for nonlinear design with continuous parameter, design, and observation spaces. We approximate the optimal policy by using backward induction with regression to construct and refine value function approximations in the dynamic program. The proposed algorithm iteratively generates trajectories via exploration and exploitation to improve approximation accuracy in frequently visited regions of the state space. Numerical results are verified against analytical solutions in a linear-Gaussian setting. Advantages over batch and greedy design are then demonstrated on a nonlinear source inversion problem where we seek an optimal policy for sequential sensing.
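
As a rough illustration of backward induction with regression, the sketch below works through a toy linear-Gaussian problem. Everything specific in it is an assumption made for illustration rather than something taken from the paper: the scalar model y = d * theta + eps, the quadratic design cost, the two-experiment horizon, the design grid, the hand-picked regression features, and all function names. It also samples regression states uniformly instead of generating trajectories by exploration and exploitation, so it should be read as a simplification of the idea, not as the proposed algorithm.

```python
import numpy as np

# All constants are illustrative assumptions, not values from the paper.
PRIOR_MEAN, PRIOR_VAR = 0.0, 1.0       # theta ~ N(0, 1)
NOISE_VAR = 1.0                         # y = d * theta + eps, eps ~ N(0, 1)
COST = 0.1                              # each experiment costs COST * d**2
N_EXPERIMENTS = 2
DESIGNS = np.linspace(0.1, 2.0, 20)     # candidate designs on a grid
NODES, WEIGHTS = np.polynomial.hermite_e.hermegauss(10)
WEIGHTS = WEIGHTS / WEIGHTS.sum()       # quadrature rule for a N(0, 1) variable


def terminal_reward(mean, var):
    """KL divergence from the prior to the final Gaussian posterior."""
    return 0.5 * (np.log(PRIOR_VAR / var)
                  + (var + (mean - PRIOR_MEAN) ** 2) / PRIOR_VAR - 1.0)


def features(mean, var):
    """Hand-picked features of the state (posterior mean, posterior variance)."""
    return np.stack([np.ones_like(var), np.log(var), var, 1.0 / var, mean ** 2],
                    axis=-1)


def q_values(mean, var, next_value):
    """Expected reward-to-go of every candidate design from one state."""
    post_var = 1.0 / (1.0 / var + DESIGNS ** 2 / NOISE_VAR)        # shape (D,)
    y_std = np.sqrt(DESIGNS ** 2 * var + NOISE_VAR)                 # predictive std of y
    y = DESIGNS[:, None] * mean + y_std[:, None] * NODES[None, :]   # shape (D, Q)
    post_mean = post_var[:, None] * (mean / var + DESIGNS[:, None] * y / NOISE_VAR)
    post_var_grid = np.broadcast_to(post_var[:, None], post_mean.shape)
    future = next_value(post_mean, post_var_grid)                   # shape (D, Q)
    return -COST * DESIGNS ** 2 + future @ WEIGHTS                  # shape (D,)


def backward_induction(n_states=200, seed=0):
    """Return (a best first design, estimated optimal expected reward)."""
    rng = np.random.default_rng(seed)
    value = terminal_reward                       # the terminal value is known exactly
    for _ in range(N_EXPERIMENTS - 1, 0, -1):     # stages N-1, ..., 1
        # Uniformly sampled regression states (an assumed simplification).
        means = rng.uniform(-2.0, 2.0, n_states)
        variances = rng.uniform(0.2, 1.0, n_states)
        # Regression targets: best expected reward-to-go at each sampled state.
        targets = np.array([q_values(m, v, value).max()
                            for m, v in zip(means, variances)])
        coef, *_ = np.linalg.lstsq(features(means, variances), targets, rcond=None)
        # Linear-in-features approximation of this stage's value function.
        value = lambda m, v, c=coef: features(m, v) @ c
    q0 = q_values(PRIOR_MEAN, PRIOR_VAR, value)   # stage 0 starts from the prior
    return DESIGNS[np.argmax(q0)], q0.max()


first_design, expected_reward = backward_induction()
```

Under these assumed settings the expected utility depends only on the sum of squared designs, so the optimum can be worked out by hand as 0.5 ln 5 - 0.4 (about 0.405), and several first designs tie. The value returned by backward_induction() can be compared against that analytical figure, in the spirit of the linear-Gaussian verification described in the abstract.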
