Paper Information - An Introduction to Reinforcement Learning - 字舞流文

An Introduction to Reinforcement Learning

Eduardo F. Morales | Julio H. Zaragoza

[1] Eduardo F. Morales, et al. Dynamic Reward Shaping: Training a Robot by Voice, 2010, IBERAMIA.

[2] Pieter Abbeel, et al. Autonomous Helicopter Aerobatics through Apprenticeship Learning, 2010, Int. J. Robotics Res.

[3] Thomas G. Dietterich, et al. Reinforcement Learning Via Practice and Critique Advice, 2010, AAAI.

[4] Eduardo F. Morales, et al. Relational Reinforcement Learning with Continuous Actions by Combining Behavioural Cloning and Locally Weighted Regression, 2010, J. Intell. Learn. Syst. Appl.

[5] Peter Stone, et al. Combining manual feedback with subsequent MDP reward signals for reinforcement learning, 2010, AAMAS.

[6] Pieter Abbeel, et al. Parameterized maneuver learning for autonomous helicopter flight, 2010, 2010 IEEE International Conference on Robotics and Automation.

[7] Stefan Schaal, et al. A Generalized Path Integral Control Approach to Reinforcement Learning, 2010, J. Mach. Learn. Res.

[8] Daniel Kudenko, et al. Theoretical and Empirical Analysis of Reward Shaping in Reinforcement Learning, 2009, 2009 International Conference on Machine Learning and Applications.

[9] Peter Stone, et al. Transfer Learning for Reinforcement Learning Domains: A Survey, 2009, J. Mach. Learn. Res.

[10] Abhijit Gosavi, et al. Reinforcement Learning: A Tutorial Survey and Recent Advances, 2009, INFORMS J. Comput.

[11] Bhaskara Marthi, et al. Automatic shaping and decomposition of reward functions, 2007, ICML '07.

[12] Alan Fern, et al. Multi-task reinforcement learning: a hierarchical Bayesian approach, 2007, ICML '07.

[13] Jude W. Shavlik, et al. Relational Macros for Transfer in Reinforcement Learning, 2007, ILP.

[14] Brett Browning, et al. Learning by demonstration with critique from a human teacher, 2007, 2007 2nd ACM/IEEE International Conference on Human-Robot Interaction (HRI).

[15] Eyal Amir, et al. Bayesian Inverse Reinforcement Learning, 2007, IJCAI.

[16] Manuela M. Veloso, et al. Probabilistic policy reuse in a reinforcement learning agent, 2006, AAMAS '06.

[17] S. Mahadevan, et al. Proto-transfer Learning in Markov Decision Processes Using Spectral Methods, 2006.

[18] Eduardo F. Morales, et al. Learning to fly by combining reinforcement learning with behavioural cloning, 2004, ICML.

[19] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.

[20] Gerald Tesauro, et al. Practical issues in temporal difference learning, 1992, Machine Learning.

[21] Richard S. Sutton, et al. Reinforcement learning with replacing eligibility traces, 1996, Machine Learning.

[22] Dirk Ormoneit, et al. Kernel-Based Reinforcement Learning, 2002, Machine Learning.

[23] M. van Otterlo. Efficient Reinforcement Learning using Relational Aggregation, 2003.

[24] Stefan Schaal, et al. Reinforcement Learning for Humanoid Robotics, 2003.

[25] Bernhard Hengst, et al. Discovering Hierarchy in Reinforcement Learning with HEXQ, 2002, ICML.

[26] Ronen I. Brafman, et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning, 2001, J. Mach. Learn. Res.

[27] Malcolm J. A. Strens, et al. A Bayesian Framework for Reinforcement Learning, 2000, ICML.

[28] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.

[29] Andrew Y. Ng, et al. Algorithms for Inverse Reinforcement Learning, 2000, ICML.

[30] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.

[31] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.

[32] Mark D. Pendrith, et al. RL-TOPS: An Architecture for Modularity and Re-Use in Reinforcement Learning, 1998, ICML.

[33] Stuart J. Russell, et al. Bayesian Q-Learning, 1998, AAAI/IAAI.

[34] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.

[35] Stuart J. Russell, et al. Reinforcement Learning with Hierarchies of Machines, 1997, NIPS.

[36] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.

[37] Leemon C. Baird, et al. Residual Algorithms: Reinforcement Learning with Function Approximation, 1995, ICML.

[38] Michael L. Littman, et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning, 1994, ICML.

[39] Hyongsuk Kim, et al. CMAC-based adaptive critic self-learning control, 1991, IEEE Trans. Neural Networks.

[40] Leslie Pack Kaelbling, et al. Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons, 1991, IJCAI.

[41] Tomaso A. Poggio, et al. Extensions of a Theory of Networks for Approximation and Learning, 1990, NIPS.

[42] De, et al. Relational Reinforcement Learning, 2001, Encyclopedia of Machine Learning and Data Mining.