A Method of Offline Reinforcement Learning Virtual Reality Satellite Attitude Control Based on Generative Adversarial Network
[1] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[2] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[3] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[4] Wojciech Zaremba, et al. OpenAI Gym, 2016, ArXiv.
[5] Demis Hassabis, et al. Mastering Atari, Go, chess and shogi by planning with a learned model, 2019, Nature.
[6] Jorge L. Moiola, et al. Controlling an Inverted Pendulum with Bounded Controls, 2002.
[7] Mahesan Niranjan, et al. On-line Q-learning using connectionist systems, 1994.
[8] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[9] Herke van Hoof, et al. Addressing Function Approximation Error in Actor-Critic Methods, 2018, ICML.
[10] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[11] Jimmy Ba, et al. Exploring Model-based Planning with Policy Networks, 2019, ICLR.
[12] Dennis S. Bernstein, et al. Asymptotic Smooth Stabilization of the Inverted 3-D Pendulum, 2009, IEEE Transactions on Automatic Control.
[13] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[14] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[15] Sergey Levine, et al. Learning to Walk via Deep Reinforcement Learning, 2018, Robotics: Science and Systems.
[16] Russ Tedrake, et al. Simulation-based LQR-trees with input and state constraints, 2010, 2010 IEEE International Conference on Robotics and Automation.
[17] Yuandong Tian, et al. Algorithmic Framework for Model-based Deep Reinforcement Learning with Theoretical Guarantees, 2018, ICLR.
[18] Sergey Levine, et al. When to Trust Your Model: Model-Based Policy Optimization, 2019, NeurIPS.