Hierarchical Knowledge Gradient for Sequential Sampling
[1] R. Bechhofer. A Single-Sample Multiple Decision Procedure for Ranking Means of Normal Populations with known Variances , 1954 .
[2] Harold J. Kushner,et al. A New Method of Locating the Maximum Point of an Arbitrary Multipeak Curve in the Presence of Noise , 1964 .
[3] Russell R. Barton,et al. Chapter 18 Metamodel-Based Simulation Optimization , 2006, Simulation.
[4] Warren B. Powell,et al. Value Function Approximation using Multiple Aggregation for Multiattribute Resource Management , 2008 .
[5] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..
[6] Rémi Munos,et al. Pure Exploration in Multi-armed Bandits Problems , 2009, ALT.
[7] David B. Dunson,et al. Bayesian Data Analysis , 2010 .
[8] Eric R. Ziegel,et al. The Elements of Statistical Learning , 2003, Technometrics.
[9] E. Vázquez,et al. Convergence properties of the expected improvement algorithm with fixed mean and covariance functions , 2007, 0712.3744.
[10] John Shawe-Taylor,et al. Regret Bounds for Gaussian Process Bandit Problems , 2010, AISTATS 2010.
[11] F. H. Branin. Widely convergent method for finding multiple solutions of simultaneous nonlinear equations , 1972 .
[12] Andreas Krause,et al. Information-Theoretic Regret Bounds for Gaussian Process Optimization in the Bandit Setting , 2009, IEEE Transactions on Information Theory.
[13] A. Tamhane. Design and Analysis of Experiments for Statistical Selection, Screening, and Multiple Comparisons , 1995 .
[14] D. Bertsekas,et al. Adaptive aggregation methods for infinite horizon dynamic programming , 1989 .
[15] Eric Walter,et al. An informational approach to the global optimization of expensive-to-evaluate functions , 2006, J. Glob. Optim..
[16] D. Lizotte. Practical bayesian optimization , 2008 .
[17] Jonas Mockus,et al. On Bayesian Methods for Seeking the Extremum , 1974, Optimization Techniques.
[18] Shie Mannor,et al. Action Elimination and Stopping Conditions for Reinforcement Learning , 2003, ICML.
[19] James C. Spall,et al. Introduction to stochastic search and optimization - estimation, simulation, and control , 2003, Wiley-Interscience series in discrete mathematics and optimization.
[20] R. Tibshirani,et al. Combining Estimates in Regression and Classification , 1996 .
[21] Robert D. Kleinberg,et al. Online decision problems with large strategy sets , 2005 .
[22] M. Degroot. Optimal Statistical Decisions , 1970 .
[23] Nick Littlestone,et al. From on-line to batch learning , 1989, COLT '89.
[24] Chun-Hung Chen,et al. A gradient approach for smartly allocating computing budget for discrete event simulation , 1996, Winter Simulation Conference.
[25] Warren B. Powell,et al. The Knowledge-Gradient Policy for Correlated Normal Beliefs , 2009, INFORMS J. Comput..
[26] Eli Upfal,et al. Multi-Armed Bandits in Metric Spaces , 2008 .
[27] Michael James Sasena,et al. Flexibility and efficiency enhancements for constrained global design optimization with kriging approximations. , 2002 .
[28] Shie Mannor,et al. PAC Bounds for Multi-armed Bandit and Markov Decision Processes , 2002, COLT.
[29] Yuhong Yang. Adaptive Regression by Mixing , 2001 .
[30] Warren B. Powell,et al. Optimal Learning , 2022, Encyclopedia of Machine Learning and Data Mining.
[31] H. Robbins. A Stochastic Approximation Method , 1951 .
[32] Warren B. Powell,et al. An Approximate Dynamic Programming Algorithm for Large-Scale Fleet Management: A Case Application , 2009, Transp. Sci..
[33] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[34] Csaba Szepesvári,et al. Online Optimization in X-Armed Bandits , 2008, NIPS.
[35] C. N. Bouza,et al. Spall, J.C. Introduction to stochastic search and optimization. Estimation, simulation and control. Wiley Interscience Series in Discrete Mathematics and Optimization, 2003 , 2004 .
[36] Michael Kearns,et al. Near-Optimal Reinforcement Learning in Polynomial Time , 2002, Machine Learning.
[37] Jürgen Branke,et al. Sequential Sampling to Myopically Maximize the Expected Value of Information , 2010, INFORMS J. Comput..
[38] Csaba Szepesvári,et al. Empirical Bernstein stopping , 2008, ICML '08.
[39] Howard Raiffa,et al. Applied Statistical Decision Theory. , 1961 .
[40] Donald R. Jones,et al. Efficient Global Optimization of Expensive Black-Box Functions , 1998, J. Glob. Optim..
[41] Thomas P. Hayes,et al. High-Probability Regret Bounds for Bandit Online Linear Optimization , 2008, COLT.
[42] Elad Hazan,et al. Competing in the Dark: An Efficient Algorithm for Bandit Linear Optimization , 2008, COLT.
[43] James R. Evans,et al. Aggregation and Disaggregation Techniques and Methodology in Optimization , 1991, Oper. Res..
[44] D. Solomon,et al. Applied Statistical Decision Theory. , 1961 .
[45] Frank Hutter,et al. Automated configuration of algorithms for solving hard computational problems , 2009 .
[46] N. Zheng,et al. Global Optimization of Stochastic Black-Box Systems via Sequential Kriging Meta-Models , 2006, J. Glob. Optim..
[47] Carl E. Rasmussen,et al. Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.
[48] T. Lai. Adaptive treatment allocation and the multi-armed bandit problem , 1987 .
[49] Roel Bosker,et al. Multilevel analysis : an introduction to basic and advanced multilevel modeling , 1999 .
[50] Adam Tauman Kalai,et al. Online convex optimization in the bandit setting: gradient descent without a gradient , 2004, SODA '05.
[51] Leslie Pack Kaelbling,et al. Learning in embedded systems , 1993 .
[52] S. Gupta,et al. Bayesian look ahead one-stage sampling allocations for selection of the best population , 1996 .
[53] Warren B. Powell,et al. A Knowledge-Gradient Policy for Sequential Information Collection , 2008, SIAM J. Control. Optim..
[54] Chun-Hung Chen,et al. Opportunity Cost and OCBA Selection Procedures in Ordinal Optimization for a Fixed Number of Alternative Systems , 2007, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[55] Nando de Freitas,et al. A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning , 2010, ArXiv.
[56] Stephen E. Chick,et al. New Two-Stage and Sequential Procedures for Selecting the Best Simulated System , 2001, Oper. Res..
[57] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[58] David Lindley,et al. Optimal Statistical Decisions , 1971 .
[59] Shai Shalev-Shwartz,et al. Online learning: theory, algorithms and applications (למידה מקוונת.) , 2007 .
[60] Russell Greiner,et al. The Budgeted Multi-armed Bandit Problem , 2004, COLT.