[1] Gerald Tesauro, et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play, 1994, Neural Computation.
[2] Charles W. Anderson, et al. Learning and problem-solving with multilayer connectionist systems (adaptive, strategy learning, neural networks, reinforcement learning), 1986.
[3] Kilian Q. Weinberger, et al. Web-Search Ranking with Initialized Gradient Boosted Regression Trees, 2010, Yahoo! Learning to Rank Challenge.
[4] Stefanie Tellex, et al. Goal-Based Action Priors, 2015, ICAPS.
[5] John Langford, et al. Exploration in Metric State Spaces, 2003, ICML.
[6] Sergey Levine, et al. Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, 2015, ArXiv.
[7] Stefanie Tellex, et al. Minecraft as an Experimental World for AI in Robotics, 2015, AAAI Fall Symposia.
[8] Richard S. Sutton, et al. Temporal credit assignment in reinforcement learning, 1984.
[9] Thomas J. Walsh, et al. Towards a Unified Theory of State Abstraction for MDPs, 2006, AI&M.
[10] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[11] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[12] R. Lathe. PhD by thesis, 1988, Nature.
[13] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[14] Sergey Levine, et al. Model-based reinforcement learning with parametrized physical models and optimism-driven exploration, 2015, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[15] J. Friedman. Greedy function approximation: A gradient boosting machine, 2001.
[16] Peter Dayan, et al. Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search, 2012, NIPS.
[17] Peter Norvig, et al. Artificial Intelligence: A Modern Approach, 1995.
[18] Ronen I. Brafman, et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning, 2001, J. Mach. Learn. Res.
[19] P. Cochat, et al. Et al, 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.
[20] Honglak Lee, et al. Action-Conditional Video Prediction using Deep Networks in Atari Games, 2015, NIPS.
[21] Peter L. Bartlett, et al. Functional Gradient Techniques for Combining Hypotheses, 2000.
[22] Pierre Geurts, et al. Tree-Based Batch Mode Reinforcement Learning, 2005, J. Mach. Learn. Res.
[23] Luc Van Gool, et al. Speeded-Up Robust Features (SURF), 2008, Comput. Vis. Image Underst.
[24] Gilles Louppe, et al. Independent consultant, 2013.
[25] D. B. Davis, et al. Intel Corp., 1993.
[26] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[27] Malcolm J. A. Strens, et al. A Bayesian Framework for Reinforcement Learning, 2000, ICML.
[28] Michael L. Littman, et al. An Ensemble of Linearly Combined Reinforcement-Learning Agents, 2013, AAAI.