Tutorial: Optimization via simulation with Bayesian statistics and dynamic programming

Bayesian statistics comprises a powerful set of methods for analyzing simulated systems. Combined with dynamic programming and other methods for sequential decision making under uncertainty, Bayesian methods have been used to design algorithms for finding the best of several simulated systems. When the dynamic program can be solved exactly, these algorithms have optimal average-case performance. When it cannot, the dynamic programming analysis still supports the development of approximate methods whose average-case performance is suboptimal but nevertheless good. Such methods are particularly useful when the cost of simulation prevents the use of procedures with worst-case statistical performance guarantees. We provide an overview of Bayesian methods for selecting the best. We first give an in-depth treatment of the simpler case of ranking and selection with independent priors, which is appropriate for smaller-scale problems, and then discuss how the same ideas can be applied to correlated priors, which are appropriate for large-scale problems.
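To make the independent-prior setting concrete, the following is a minimal Python sketch of sequential ranking and selection under independent normal beliefs. It is an illustration under stated assumptions, not the tutorial's own procedure. The sampling policy used here is the knowledge-gradient rule, a one-step-lookahead policy from this literature, chosen only as an example. The sampling variance `noise_var` is assumed known and common to all systems, and the function names (`update`, `kg_factors`, `select_best`), the prior parameters, and the simulated normal outputs are all hypothetical.

```python
import math
import random


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def update(mean, var, y, noise_var):
    """Conjugate normal update of one system's belief after observing y."""
    precision = 1.0 / var + 1.0 / noise_var
    new_mean = (mean / var + y / noise_var) / precision
    return new_mean, 1.0 / precision


def kg_factors(means, variances, noise_var):
    """One-step expected gain in the best posterior mean from sampling each system."""
    factors = []
    for i, (m, v) in enumerate(zip(means, variances)):
        # Standard deviation of the change in system i's posterior mean.
        sigma_tilde = v / math.sqrt(v + noise_var)
        best_other = max(mj for j, mj in enumerate(means) if j != i)
        z = -abs(m - best_other) / sigma_tilde
        factors.append(sigma_tilde * (z * normal_cdf(z) + normal_pdf(z)))
    return factors


def select_best(true_means, noise_var, budget,
                prior_mean=0.0, prior_var=100.0, seed=0):
    """Spend `budget` simulated samples, then pick the largest posterior mean."""
    rng = random.Random(seed)
    k = len(true_means)
    means = [prior_mean] * k
    variances = [prior_var] * k
    for _ in range(budget):
        factors = kg_factors(means, variances, noise_var)
        i = max(range(k), key=factors.__getitem__)
        # Stand-in for one replication of the simulation of system i.
        y = rng.gauss(true_means[i], math.sqrt(noise_var))
        means[i], variances[i] = update(means[i], variances[i], y, noise_var)
    best = max(range(k), key=means.__getitem__)
    return best, means, variances
```

A call such as `select_best([0.0, 1.0, 5.0], noise_var=1.0, budget=60)` returns the index of the system with the largest posterior mean together with the final beliefs. Because the priors are independent, each observation updates only the sampled system's belief; with correlated priors, one observation would also shift the beliefs about the other systems.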
