Value Iteration in Continuous Actions, States and Time
Michael Lutter | Shie Mannor | Jan Peters | Dieter Fox | Animesh Garg
[1] Eduardo F. Morales, et al. An Introduction to Reinforcement Learning, 2011.
[2] Silvio Savarese, et al. Adversarially Robust Policy Learning through Active Construction of Physically-Plausible Perturbations, 2017.
[3] Sicun Gao, et al. Neural Lyapunov Control, 2020, NeurIPS.
[4] Andreas Krause, et al. The Lyapunov Neural Network: Adaptive Stability Certification for Safe Learning of Dynamical Systems, 2018, CoRL.
[5] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[6] Andrea Bonarini, et al. MushroomRL: Simplifying Reinforcement Learning Research, 2020, J. Mach. Learn. Res.
[7] John N. Tsitsiklis, et al. Feature-based methods for large scale dynamic programming, 2004, Machine Learning.
[8] Paris Perdikaris, et al. Physics Informed Deep Learning (Part I): Data-driven Solutions of Nonlinear Partial Differential Equations, 2017, ArXiv.
[9] Jan Peters, et al. HJB Optimal Feedback Control with Deep Differential Value Functions and Action Constraints, 2019, CoRL.
[10] Xingye Da, et al. Dynamics Randomization Revisited: A Case Study for Quadrupedal Locomotion, 2020, 2021 IEEE International Conference on Robotics and Automation (ICRA).
[11] Demis Hassabis, et al. Mastering the game of Go without human knowledge, 2017, Nature.
[12] S. Lyshevski. Optimal control of nonlinear continuous-time systems: design of bounded controllers via generalized nonquadratic functionals, 1998, Proceedings of the 1998 American Control Conference (ACC).
[13] Sergey Levine, et al. Model-Based Value Estimation for Efficient Model-Free Reinforcement Learning, 2018, ArXiv.
[14] Honglak Lee, et al. Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion, 2018, NeurIPS.
[15] J. Zico Kolter, et al. Learning Stable Deep Dynamics Models, 2020, NeurIPS.
[16] Gerald Tesauro. Practical issues in temporal difference learning, 1992, Machine Learning.
[17] Philipp Hennig, et al. Optimal Reinforcement Learning for Gaussian Systems, 2011, NIPS.
[18] Christian Stöcker, et al. Event-Based Control, 2014.
[19] Andrew G. Barto, et al. Learning to Act Using Real-Time Dynamic Programming, 1995, Artif. Intell.
[20] Jeongho Kim, et al. Hamilton-Jacobi Deep Q-Learning for Deterministic Continuous-Time Systems with Lipschitz Continuous Controls, 2020, J. Mach. Learn. Res.
[21] Sean R. Eddy. What is dynamic programming?, 2004, Nature Biotechnology.
[22] Sergio Gomez Colmenarejo, et al. RL Unplugged: Benchmarks for Offline Reinforcement Learning, 2020, ArXiv.
[23] Kenji Doya. Reinforcement Learning in Continuous Time and Space, 2000, Neural Computation.
[24] Stefan Schaal, et al. Reinforcement learning of motor skills in high dimensions: A path integral approach, 2010, IEEE International Conference on Robotics and Automation (ICRA).
[25] Byron Boots, et al. Learning a Contact-Adaptive Controller for Robust, Efficient Legged Locomotion, 2020, CoRL.
[26] Donald E. Kirk. Optimal control theory: an introduction, 1970.
[27] Jan Peters, et al. Domain Randomization for Simulation-Based Policy Optimization with Transferability Assessment, 2018, CoRL.
[28] Karthikeyan Rajagopal, et al. Neural Network-Based Solutions for Stochastic Optimal Control Using Path Integrals, 2017, IEEE Transactions on Neural Networks and Learning Systems.
[29] Csaba Szepesvári, et al. Finite-Time Bounds for Fitted Value Iteration, 2008, J. Mach. Learn. Res.
[30] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[31] Martin A. Riedmiller. Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method, 2005, ECML.
[32] Sergey Levine, et al. Conservative Safety Critics for Exploration, 2021, ICLR.
[33] Pierre Geurts, et al. Tree-Based Batch Mode Reinforcement Learning, 2005, J. Mach. Learn. Res.
[34] Andrew W. Moore, et al. Generalization in Reinforcement Learning: Safely Approximating the Value Function, 1994, NIPS.
[35] Evangelos A. Theodorou, et al. Safe Optimal Control Using Stochastic Barrier Functions and Deep Forward-Backward SDEs, 2020, ArXiv.
[36] H. Kappen. Linear theory for control of nonlinear stochastic systems, 2004, Physical Review Letters.
[37] George Em Karniadakis, et al. Hidden fluid mechanics: Learning velocity and pressure fields from flow visualizations, 2020, Science.
[38] Leemon C. Baird. Residual Algorithms: Reinforcement Learning with Function Approximation, 1995, ICML.
[39] Natalia Gimelshein, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library, 2019, NeurIPS.
[40] Shie Mannor, et al. Regularized Fitted Q-Iteration for planning in continuous-space Markovian decision problems, 2009, American Control Conference (ACC).
[41] Andreas Krause, et al. Safe Model-based Reinforcement Learning with Stability Guarantees, 2017, NIPS.
[42] Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[43] Paris Perdikaris, et al. Physics Informed Deep Learning (Part II): Data-driven Discovery of Nonlinear Partial Differential Equations, 2017, ArXiv.
[44] Dieter Fox, et al. BayesSim: adaptive domain randomization via probabilistic inference for robotics simulators, 2019, Robotics: Science and Systems.
[45] P. Olver. Nonlinear Systems, 2013.
[46] K. Jarrod Millman, et al. Array programming with NumPy, 2020, Nature.
[47] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[48] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[49] Yevgen Chebotar, et al. Closing the Sim-to-Real Loop: Adapting Simulation Randomization with Real World Experience, 2018, 2019 International Conference on Robotics and Automation (ICRA).
[50] Yuval Tassa, et al. Least Squares Solutions of the HJB Equation With Neural Network Value-Function Approximators, 2007, IEEE Transactions on Neural Networks.
[51] Evangelos A. Theodorou, et al. Learning Deep Stochastic Optimal Control Policies Using Forward-Backward SDEs, 2019, Robotics: Science and Systems.
[52] Silvio Savarese, et al. ADAPT: Zero-Shot Adaptive Policy Transfer for Stochastic Dynamical Systems, 2017, ISRR.
[53] Yunpeng Pan, et al. Model-based Path Integral Stochastic Control: A Bayesian Nonparametric Approach, 2014, ArXiv.
[54] Sergey Levine, et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation, 2015, ICLR.
[55] Xiong Yang, et al. Reinforcement learning for adaptive optimal control of unknown continuous-time nonlinear systems with input constraints, 2014, Int. J. Control.
[56] Emanuel Todorov. Linearly-solvable Markov decision problems, 2006, NIPS.