In reinforcement learning, all objective functions are not equal
[1] Thomas G. Dietterich. The MAXQ Method for Hierarchical Reinforcement Learning, 1998, ICML.
[2] Romain Laroche, et al. On Value Function Representation of Long Horizon Problems, 2018, AAAI.
[3] Alex M. Andrew, et al. Robot Learning, edited by Jonathan H. Connell and Sridhar Mahadevan, Kluwer, Boston, 1993/1997, xii+240 pp., ISBN 0-7923-9365-1 (Hardback, 218.00 Guilders, $120.00, £89.95), 1999, Robotica.
[4] Marek Petrik, et al. Biasing Approximate Dynamic Programming with a Lower Discount Factor, 2008, NIPS.
[5] Tom Schaul, et al. Natural Value Approximators: Learning when to Trust Past Estimates, 2017, NIPS.
[6] Romain Laroche, et al. Hybrid Reward Architecture for Reinforcement Learning, 2017, NIPS.
[7] Romain Laroche, et al. Multi-Advisor Reinforcement Learning, 2017, ArXiv.
[8] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[9] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.