Boosted Bellman Residual Minimization Handling Expert Demonstrations
[1] N. Aronszajn. Theory of Reproducing Kernels, 1950.
[2] F. Clarke. Generalized gradients and applications, 1975.
[3] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[4] Bin Yu. Rates of Convergence for Empirical Processes of Stationary Mixing Sequences, 1994.
[5] K. I. M. McKinnon, et al. On the Generation of Markov Decision Processes, 1995.
[6] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[7] Michail G. Lagoudakis, et al. Least-Squares Policy Iteration, 2003, J. Mach. Learn. Res.
[8] Steven J. Bradtke, et al. Linear Least-Squares algorithms for temporal difference learning, 2004, Machine Learning.
[9] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[10] J. Andrew Bagnell, et al. Maximum margin planning, 2006, ICML.
[11] Csaba Szepesvári, et al. Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path, 2006, COLT.
[12] Csaba Szepesvári, et al. Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path, 2006, Machine Learning.
[13] Siddhartha S. Srinivasa, et al. Imitation learning for locomotion and manipulation, 2007, 2007 7th IEEE-RAS International Conference on Humanoid Robots.
[14] Rémi Munos, et al. Performance Bounds in Lp-norm for Approximate Value Iteration, 2007, SIAM J. Control. Optim.
[15] Michael H. Bowling, et al. Apprenticeship learning using linear programming, 2008, ICML '08.
[16] Peter A. Flach, et al. Evaluation Measures for Multi-class Subgroup Discovery, 2009, ECML/PKDD.
[17] Csaba Szepesvári, et al. Error Propagation for Approximate Policy and Value Iteration, 2010, NIPS.
[18] Bernhard Schölkopf, et al. Hilbert Space Embeddings and Metrics on Probability Measures, 2009, J. Mach. Learn. Res.
[19] J. Andrew Bagnell, et al. Generalized Boosting Algorithms for Convex Optimization, 2011, ICML.
[20] Wei-Yin Loh, et al. Classification and regression trees, 2011, WIREs Data Mining Knowl. Discov.
[21] Geoffrey J. Gordon, et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, 2010, AISTATS.
[22] Matthieu Geist, et al. Inverse Reinforcement Learning through Structured Classification, 2012, NIPS.
[23] Thomas G. Dietterich, et al. Active Imitation Learning via Reduction to I.I.D. Active Learning, 2012, AAAI Fall Symposium: Robots Learning Interactively from Human Teachers.
[24] Guy Lever, et al. Modelling transition dynamics in MDPs with RKHS embeddings, 2012, ICML.
[25] Joelle Pineau, et al. Learning from Limited Demonstrations, 2013, NIPS.
[26] Matthieu Geist, et al. Learning from Demonstrations: Is It Worth Estimating a Reward Function?, 2013, ECML/PKDD.
[27] Franziska Wulf. Minimization Methods For Non Differentiable Functions, 2016.