Batch, Off-Policy and Model-Free Apprenticeship Learning
[1] Pieter Abbeel,et al. Hierarchical Apprenticeship Learning with Application to Quadruped Locomotion , 2007, NIPS.
[2] J. Andrew Bagnell,et al. Maximum margin planning , 2006, ICML.
[3] Michail G. Lagoudakis,et al. Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res.
[4] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994.
[5] Anind K. Dey,et al. Maximum Entropy Inverse Reinforcement Learning , 2008, AAAI.
[6] Csaba Szepesvári,et al. Apprenticeship Learning using Inverse Reinforcement Learning and Gradient Methods , 2007, UAI.
[7] K. Fernow. New York , 1896, American Potato Journal.
[8] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res.
[9] Steven J. Bradtke,et al. Linear Least-Squares algorithms for temporal difference learning , 2004, Machine Learning.
[10] Eyal Amir,et al. Bayesian Inverse Reinforcement Learning , 2007, IJCAI.
[11] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998.
[12] Stuart J. Russell. Learning agents for uncertain environments (extended abstract) , 1998, COLT '98.
[13] Dimitri P. Bertsekas,et al. Least Squares Policy Evaluation Algorithms with Linear Function Approximation , 2003, Discret. Event Dyn. Syst.
[14] Robert E. Schapire,et al. A Game-Theoretic Approach to Apprenticeship Learning , 2007, NIPS.
[15] Michael H. Bowling,et al. Apprenticeship learning using linear programming , 2008, ICML '08.
[16] Siddhartha S. Srinivasa,et al. Imitation learning for locomotion and manipulation , 2007, 2007 7th IEEE-RAS International Conference on Humanoid Robots.
[17] Andrew Y. Ng,et al. Algorithms for Inverse Reinforcement Learning , 2000, ICML.
[18] David M. Bradley,et al. Boosting Structured Prediction for Imitation Learning , 2006, NIPS.
[19] Pieter Abbeel,et al. Apprenticeship learning via inverse reinforcement learning , 2004, ICML.
[20] Alessandro Lazaric,et al. Finite-Sample Analysis of LSTD , 2010, ICML.
[21] Andrew G. Barto,et al. Reinforcement learning , 1998.