Selecting the Best Optimizing System

We formulate the problem of selecting the best optimizing system (SBOS) and provide solutions to it. In an SBOS problem, a finite number of systems are contenders, and inside each system a continuous decision variable affects the system’s expected performance. An SBOS problem compares the systems based on their expected performance under their own optimally chosen decisions in order to select the best, with advance knowledge of neither the systems’ expected performances nor the optimizing decision inside each system. We design easy-to-implement algorithms that adaptively choose a system and a decision at which to evaluate the noisy system performance, sequentially eliminate inferior systems, and eventually recommend a system as the best after spending a user-specified budget. The proposed algorithms integrate the stochastic gradient descent method and the sequential elimination method to simultaneously exploit the structure inside each system and make comparisons across systems. For the proposed algorithms, we prove that the probability of false selection converges to zero at an exponential rate as the budget grows to infinity. We present three numerical examples that represent three practical cases of SBOS problems. Across a range of problem settings and sampling budgets, the proposed algorithms consistently outperform benchmark algorithms in terms of the probability of false selection.
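To make the combination of stochastic gradient descent and sequential elimination concrete, the following is a minimal illustrative sketch, not the algorithm analyzed in the paper. It assumes that smaller expected performance is better, that each system exposes a hypothetical oracle returning a noisy value and a noisy gradient at a chosen decision, that the budget is split evenly across rounds, and that one system is eliminated per round. The names `make_quadratic_system` and `select_best_system`, the toy quadratic systems, and the step-size rule are all assumptions made for illustration.

```python
import random


def make_quadratic_system(center, best_value, noise_sd=1.0):
    """Toy system (assumption): expected loss (x - center)^2 + best_value."""
    def sample(x, rng):
        # Noisy performance and noisy gradient at decision x.
        value = (x - center) ** 2 + best_value + rng.gauss(0.0, noise_sd)
        grad = 2.0 * (x - center) + rng.gauss(0.0, noise_sd)
        return value, grad
    return sample


def select_best_system(systems, budget, step=0.5, seed=0):
    """Sketch: SGD inside each system, one elimination per round.

    systems: list of callables sample(x, rng) -> (noisy_value, noisy_gradient).
    Returns (index of the recommended system, its final decision).
    """
    k = len(systems)
    if k < 2:
        raise ValueError("need at least two systems to compare")
    rng = random.Random(seed)
    alive = list(range(k))
    x = [0.0] * k      # current decision inside each system
    n = [0] * k        # samples spent on each system
    mean = [0.0] * k   # running average of observed noisy values
    rounds = k - 1
    per_round = budget // rounds
    for _ in range(rounds):
        per_system = max(1, per_round // len(alive))
        for i in alive:
            for _ in range(per_system):
                value, grad = systems[i](x[i], rng)
                n[i] += 1
                mean[i] += (value - mean[i]) / n[i]   # update estimate
                x[i] -= step / n[i] * grad            # SGD step, decaying size
        # Eliminate the surviving system with the worst (largest) estimate.
        worst = max(alive, key=lambda i: mean[i])
        alive.remove(worst)
    return alive[0], x[alive[0]]


systems = [
    make_quadratic_system(center=1.0, best_value=1.0),
    make_quadratic_system(center=2.0, best_value=2.0),
    make_quadratic_system(center=-1.0, best_value=3.0),
]
best, x_best = select_best_system(systems, budget=3000)
print(best, round(x_best, 2))
```

In this sketch the running average of observed values serves as a simple stand-in for an estimate of each system's optimal performance; it includes samples taken at early, suboptimal decisions, so it is slightly biased upward. How to allocate the budget and how to estimate the optimal value are design choices that the sketch fixes arbitrarily.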
