The scalarized multi-objective multi-armed bandit problem: An empirical study of its exploration vs. exploitation tradeoff
[1] Kaisa Miettinen, et al. Nonlinear multiobjective optimization, 1998, International series in operations research and management science.
[2] Alex M. Andrew, et al. ROBOT LEARNING, edited by Jonathan H. Connell and Sridhar Mahadevan, Kluwer, Boston, 1993/1997, xii+240 pp., ISBN 0-7923-9365-1 (Hardback, 218.00 Guilders, $120.00, £89.95), 1999, Robotica (Cambridge. Print).
[3] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[4] Warren B. Powell, et al. Approximate Dynamic Programming - Solving the Curses of Dimensionality, 2007.
[5] Gabriele Eichfelder, et al. Adaptive Scalarization Methods in Multiobjective Optimization, 2008, Vector Optimization.
[6] Ann Nowé, et al. Designing multi-objective multi-armed bandits algorithms: A study, 2013, The 2013 International Joint Conference on Neural Networks (IJCNN).
[7] Bernard Manderick, et al. Knowledge Gradient for Multi-objective Multi-armed Bandit Algorithms, 2014, ICAART.