Efficient Exploration in Monte Carlo Tree Search using Human Action Abstractions