Siddhartha S. Srinivasa | Sanjiban Choudhury | Brian Hou | Gilwoo Lee
[1] Gireeja Ranade, et al. Data-driven planning via imitation learning, 2017, Int. J. Robotics Res.
[2] Sergey Levine, et al. Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables, 2019, ICML.
[3] Zheng Wen, et al. Deep Exploration via Randomized Value Functions, 2017, J. Mach. Learn. Res.
[4] Yao Liu, et al. PAC Continuous State Online Multitask Reinforcement Learning with Identification, 2016, AAMAS.
[5] Benjamin Van Roy, et al. (More) Efficient Reinforcement Learning via Posterior Sampling, 2013, NIPS.
[6] Nan Rong, et al. What makes some POMDP problems easy to approximate?, 2007, NIPS.
[7] S. Shankar Sastry, et al. Surprise-Based Intrinsic Motivation for Deep Reinforcement Learning, 2017, ArXiv.
[8] Jonathan Baxter. Theoretical Models of Learning to Learn, 2020.
[9] Balaraman Ravindran, et al. EPOpt: Learning Robust Neural Network Policies Using Model Ensembles, 2016, ICLR.
[10] Sergey Levine, et al. PLATO: Policy Learning using Adaptive Trajectory Optimization, 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[11] Sergey Levine, et al. Residual Reinforcement Learning for Robot Control, 2018, 2019 International Conference on Robotics and Automation (ICRA).
[12] Angela P. Schoellig, et al. Learning-based nonlinear model predictive control to improve vision-based mobile robot path-tracking in challenging outdoor environments, 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).
[13] Sergey Levine, et al. Guided Meta-Policy Search, 2019, NeurIPS.
[14] Greg Turk, et al. Preparing for the Unknown: Learning a Universal Policy with Online System Identification, 2017, Robotics: Science and Systems.
[15] Joel Veness, et al. Monte-Carlo Planning in Large POMDPs, 2010, NIPS.
[16] Wojciech Zaremba, et al. Domain randomization for transferring deep neural networks from simulation to the real world, 2017, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[17] Jonathan Baxter, et al. A Model of Inductive Bias Learning, 2000, J. Artif. Intell. Res.
[18] Thomas L. Griffiths, et al. Recasting Gradient-Based Meta-Learning as Hierarchical Bayes, 2018, ICLR.
[19] Min Chen, et al. POMDP-lite for robust robot planning under uncertainty, 2016, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[20] Andrew Y. Ng, et al. Near-Bayesian exploration in polynomial time, 2009, ICML '09.
[21] David Hsu, et al. SARSOP: Efficient Point-Based POMDP Planning by Approximating Optimally Reachable Belief Spaces, 2008, Robotics: Science and Systems.
[22] Guy Shani, et al. A Survey of Point-Based POMDP Solvers, 2013, Auton. Agents Multi Agent Syst.
[23] Angela P. Schoellig, et al. Safe and robust learning control with Gaussian processes, 2015, 2015 European Control Conference (ECC).
[24] David Hsu, et al. LeTS-Drive: Driving in a Crowd by Learning from Tree Search, 2019, Robotics: Science and Systems.
[25] Mykel J. Kochenderfer, et al. Online Algorithms for POMDPs with Continuous State, Action, and Observation Spaces, 2017, ICAPS.
[26] Marcin Andrychowicz, et al. Sim-to-Real Transfer of Robotic Control with Dynamics Randomization, 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[27] Yoav Freund, et al. A Short Introduction to Boosting, 1999.
[28] Sergey Levine, et al. Probabilistic Model-Agnostic Meta-Learning, 2018, NeurIPS.
[29] Siddhartha S. Srinivasa, et al. MuSHR: A Low-Cost, Open-Source Robotic Racecar for Education and Research, 2019, ArXiv.
[30] Scott Sanner, et al. Reinforcement Learning with Multiple Experts: A Bayesian Model Combination Approach, 2018, NeurIPS.
[31] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[32] Jonathan Baxter, et al. Theoretical Models of Learning to Learn, 1998, Learning to Learn.
[33] Sergey Levine, et al. Meta-Reinforcement Learning of Structured Exploration Strategies, 2018, NeurIPS.
[34] Shie Mannor, et al. Bayesian Reinforcement Learning: A Survey, 2015, Found. Trends Mach. Learn.
[35] Sergey Levine, et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, 2017, ICML.
[36] Leslie Pack Kaelbling, et al. Residual Policy Learning, 2018, ArXiv.
[37] Siddhartha S. Srinivasa, et al. Bayesian Policy Optimization for Model Uncertainty, 2018, ICLR.
[38] Yoshua Bengio, et al. Bayesian Model-Agnostic Meta-Learning, 2018, NeurIPS.
[39] Angela P. Schoellig, et al. Conservative to confident: Treating uncertainty robustly within learning-based control, 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).
[40] Pieter Abbeel, et al. Some Considerations on Learning to Explore via Meta-Reinforcement Learning, 2018, ICLR.
[41] Yee Whye Teh, et al. Meta-learning of Sequential Strategies, 2019, ArXiv.
[42] Peter L. Bartlett, et al. RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning, 2016, ArXiv.
[43] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[44] Peter Dayan, et al. Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search, 2012, NIPS.
[45] Alexei A. Efros, et al. Curiosity-Driven Exploration by Self-Supervised Prediction, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[46] Filip De Turck, et al. VIME: Variational Information Maximizing Exploration, 2016, NIPS.
[47] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[48] Amos J. Storkey, et al. Exploration by Random Network Distillation, 2018, ICLR.
[49] Neil C. Rabinowitz. Meta-learners' learning dynamics are unlike learners', 2019, ArXiv.
[50] Joelle Pineau, et al. Point-based value iteration: An anytime algorithm for POMDPs, 2003, IJCAI.