A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning
[1] Stefan Schaal,et al. Is imitation learning the route to humanoid robots? , 1999, Trends in Cognitive Sciences.
[2] John Langford,et al. Approximately Optimal Approximate Reinforcement Learning , 2002, ICML.
[3] Ben Taskar,et al. Max-Margin Markov Networks , 2003, NIPS.
[4] Claudio Gentile,et al. On the generalization ability of on-line learning algorithms , 2001, IEEE Transactions on Information Theory.
[5] Pieter Abbeel,et al. Apprenticeship learning via inverse reinforcement learning , 2004, ICML.
[6] Thomas P. Hayes,et al. Error limiting reductions between classification tasks , 2005, ICML.
[7] David M. Bradley,et al. Boosting Structured Prediction for Imitation Learning , 2006, NIPS.
[8] Nathan Ratliff,et al. (Online) Subgradient Methods for Structured Prediction , 2007.
[9] Elad Hazan,et al. Logarithmic regret algorithms for online convex optimization , 2006, Machine Learning.
[10] David Silver,et al. High Performance Outdoor Navigation from Overhead Data using Imitation Learning , 2008, Robotics: Science and Systems.
[11] Ambuj Tewari,et al. On the Generalization Ability of Online Strongly Convex Programming Algorithms , 2008, NIPS.
[12] Nathan Srebro,et al. Fast Rates for Regularized Objectives , 2008, NIPS.
[13] Sham M. Kakade,et al. Mind the Duality Gap: Logarithmic regret algorithms for online optimization , 2008, NIPS.
[14] John Langford,et al. Search-based structured prediction , 2009, Machine Learning.
[15] Brett Browning,et al. A survey of robot learning from demonstration , 2009, Robotics Auton. Syst.
[16] Manuela M. Veloso,et al. Interactive Policy Learning through Confidence-Based Autonomy , 2009, J. Artif. Intell. Res.
[17] J. Andrew Bagnell,et al. Efficient Reductions for Imitation Learning , 2010, AISTATS.