Tim Seyde | Igor Gilitschenski | Wilko Schwarting | Bartolomeo Stellato | Martin Riedmiller | Markus Wulfmeier | Daniela Rus
[1] Twan Koolen,et al. Team IHMC's Lessons Learned from the DARPA Robotics Challenge Trials , 2015, J. Field Robotics.
[2] Bao-Zhu Guo,et al. The Bang–Bang Property of Time-Varying Optimal Time Control for Null Controllable Heat Equation , 2019, J. Optim. Theory Appl..
[3] Martin D. Levine,et al. A two-stage learning control system , 1970 .
[4] Raia Hadsell,et al. Value constrained model-free continuous control , 2019, ArXiv.
[5] J. P. LaSalle,et al. Time Optimal Control Systems , 1959, Proceedings of the National Academy of Sciences of the United States of America.
[6] Pieter Abbeel,et al. Benchmarking Deep Reinforcement Learning for Continuous Control , 2016, ICML.
[7] Nir Levine,et al. An empirical investigation of the challenges of real-world reinforcement learning , 2020, ArXiv.
[8] P. Goulart,et al. High-Speed Finite Control Set Model Predictive Control for Power Electronics , 2015, IEEE Transactions on Power Electronics.
[9] Emanuel Joos,et al. Reinforcement Learning of Musculoskeletal Control from Functional Simulations , 2020, MICCAI.
[10] Shimon Whiteson,et al. Growing Action Spaces , 2019, ICML.
[11] Igor Kluvánek,et al. The bang-bang principle , 1978 .
[12] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[13] Yunhao Tang,et al. Discretizing Continuous Action Space for On-Policy Optimization , 2019, AAAI.
[14] Wilko Schwarting,et al. Learning to Plan Optimistically: Uncertainty-Guided Deep Exploration via Latent Model Ensembles , 2020, ArXiv.
[15] Benjamin J. Hodel. Learning to Operate an Excavator via Policy Optimization , 2018 .
[16] Martin A. Riedmiller,et al. Compositional Transfer in Hierarchical Reinforcement Learning , 2019, Robotics: Science and Systems.
[17] Sebastian Scherer,et al. Improving Stochastic Policy Gradients in Continuous Control with Deep Reinforcement Learning using the Beta Distribution , 2017, ICML.
[18] L. M. Sonneborn,et al. The Bang-Bang Principle for Linear Control Systems , 1964 .
[19] Mohammad Norouzi,et al. Mastering Atari with Discrete World Models , 2020, ICLR.
[20] Yunhao Tang,et al. Discrete Action On-Policy Learning with Action-Value Critic , 2020, AISTATS.
[21] Arnaud Münch,et al. Numerical approximation of bang-bang controls for the heat equation: An optimal design approach , 2013, Syst. Control. Lett..
[22] Sebastien Gros,et al. MPC-based Reinforcement Learning for Economic Problems with Application to Battery Storage , 2021, 2021 European Control Conference (ECC).
[23] Marc G. Bellemare,et al. Safe and Efficient Off-Policy Reinforcement Learning , 2016, NIPS.
[24] Ben Poole,et al. Categorical Reparameterization with Gumbel-Softmax , 2016, ICLR.
[25] Sergio Gomez Colmenarejo,et al. Acme: A Research Framework for Distributed Reinforcement Learning , 2020, ArXiv.
[26] R. Bellman,et al. On the “bang-bang” control problem , 1956 .
[27] Cecilia Laschi,et al. Model-Based Reinforcement Learning for Closed-Loop Dynamic Control of Soft Robotic Manipulators , 2019, IEEE Transactions on Robotics.
[28] Yuval Tassa,et al. Relative Entropy Regularized Policy Iteration , 2018, ArXiv.
[29] Gerd Wachsmuth,et al. Second-Order Analysis and Numerical Approximation for Bang-Bang Bilinear Control Problems , 2017, SIAM J. Control. Optim..
[30] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[31] Ning Chen,et al. Time-varying bang-bang property of time optimal controls for heat equation and its application , 2018, Syst. Control. Lett..
[32] Gerd Wachsmuth,et al. Sufficient Second-Order Conditions for Bang-Bang Control Problems , 2017, SIAM J. Control. Optim..
[33] L. A. Manita. Optimal operating modes with chattering switching in manipulator control problems , 2000 .
[34] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[35] P. Schrimpf,et al. Dynamic Programming , 2011 .
[36] K. Fu,et al. A heuristic approach to reinforcement learning control systems , 1965 .
[37] Henry Zhu,et al. Soft Actor-Critic Algorithms and Applications , 2018, ArXiv.
[38] Arash Tavakoli,et al. Action Branching Architectures for Deep Reinforcement Learning , 2017, AAAI.
[39] Doina Precup,et al. Off-Policy Deep Reinforcement Learning without Exploration , 2018, ICML.
[40] Sina Ober-Blöbaum,et al. Second-Order Switching Time Optimization for Switched Dynamical Systems , 2016, IEEE Transactions on Automatic Control.
[41] Yoshua Bengio,et al. Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation , 2013, ArXiv.
[42] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[43] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[44] Che Wang,et al. Striving for Simplicity and Performance in Off-Policy DRL: Output Normalization and Non-Uniform Sampling , 2020, ICML.
[45] S. Sastry,et al. Zeno hybrid systems , 2001 .
[46] Karl Kunisch,et al. The bang-bang property of time optimal controls for the Burgers equation , 2014 .
[47] Fabio Pardo,et al. Tonic: A Deep Reinforcement Learning Library for Fast Prototyping and Benchmarking , 2020, ArXiv.
[48] Marius Tucsnak,et al. Maximum Principle and Bang-Bang Property of Time Optimal Controls for Schrödinger-Type Systems , 2013, SIAM J. Control. Optim..
[49] H. Maurer,et al. Optimization methods for the verification of second order sufficient conditions for bang–bang controls , 2005 .
[50] Charles W. Anderson,et al. Learning to Control an Inverted Pendulum with Connectionist Networks , 1988, 1988 American Control Conference.
[51] Donald E. Kirk,et al. Optimal Control Theory: An Introduction , 1970 .
[52] Petros Koumoutsakos,et al. Remember and Forget for Experience Replay , 2018, ICML.
[53] Yee Whye Teh,et al. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables , 2016, ICLR.
[54] Sandy H. Huang,et al. Learning Gentle Object Manipulation with Curiosity-Driven Deep Reinforcement Learning , 2019, ArXiv.
[55] Lorenz Wellhausen,et al. Learning quadrupedal locomotion over challenging terrain , 2020, Science Robotics.
[56] Yuval Tassa,et al. dm_control: Software and Tasks for Continuous Control , 2020, Softw. Impacts.
[57] Yubiao Zhang,et al. Decompositions and bang-bang properties , 2016, arXiv:1603.05362.
[58] Sergey M. Plis,et al. Learning to Run challenge solutions: Adapting reinforcement learning methods for neuromusculoskeletal environments , 2018, ArXiv.