Improving the Expected Improvement Algorithm

The expected improvement (EI) algorithm is a popular strategy for information collection in optimization under uncertainty. The algorithm is widely known to be too greedy, but nevertheless enjoys wide use due to its simplicity and ability to handle uncertainty and noise in a coherent decision-theoretic framework. To provide rigorous insight into EI, we study its properties in a simple setting of Bayesian optimization where the domain consists of a finite grid of points. This is the so-called best-arm identification problem, where the goal is to allocate measurement effort wisely to confidently identify the best arm using a small number of measurements. In this framework, one can show formally that EI is far from optimal. To overcome this shortcoming, we introduce a simple modification of the expected improvement algorithm. Surprisingly, this simple change results in an algorithm that is asymptotically optimal for Gaussian best-arm identification problems, and provably outperforms standard EI by an order of magnitude.
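As a rough illustration of the standard EI rule in the finite-arm Gaussian setting, the sketch below scores each arm by the expected amount its unknown mean exceeds the current best posterior mean, then measures the highest-scoring arm. It assumes independent Gaussian posteriors summarized by a mean and a standard deviation per arm; the function names (`expected_improvement`, `select_arm`) and the treatment of zero-variance arms are illustrative choices, not taken from the paper, and the modified algorithm the abstract refers to is not shown.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(means, stds):
    """Return E[max(theta_i - best, 0)] for each arm i.

    Assumes theta_i ~ N(means[i], stds[i]**2) independently (an assumption
    of this sketch), with best = max(means) the current best posterior mean.
    """
    best = max(means)
    scores = []
    for mu, sigma in zip(means, stds):
        if sigma <= 0.0:
            # No remaining uncertainty: improvement is deterministic.
            scores.append(max(mu - best, 0.0))
            continue
        z = (mu - best) / sigma
        # Closed form: sigma * phi(z) + (mu - best) * Phi(z).
        scores.append(sigma * normal_pdf(z) + (mu - best) * normal_cdf(z))
    return scores


def select_arm(means, stds):
    """Index of the arm with the largest EI score (ties go to the lowest index)."""
    scores = expected_improvement(means, stds)
    return max(range(len(scores)), key=scores.__getitem__)
```

For example, with two arms that share posterior mean 0.0 but have standard deviations 1.0 and 2.0, `select_arm([0.0, 0.0], [1.0, 2.0])` picks the second arm, because with equal means the score is proportional to the standard deviation.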

Diego Klabjan | Chao Qin | Daniel Russo
