[1] Zoran Popovic,et al. Discovery of complex behaviors through contact-invariant optimization , 2012, ACM Trans. Graph..
[2] Sergey Levine,et al. Model-Based Reinforcement Learning for Atari , 2019, ICLR.
[3] George Tucker,et al. Conservative Q-Learning for Offline Reinforcement Learning , 2020, NeurIPS.
[4] Glen Berseth,et al. DeepLoco , 2017, ACM Trans. Graph..
[5] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[6] Peter Stone,et al. Intrinsically motivated model learning for developing curious robots , 2017, Artif. Intell..
[7] Jaakko Lehtinen,et al. PPO-CMA: Proximal Policy Optimization with Covariance Matrix Adaptation , 2018, 2020 IEEE 30th International Workshop on Machine Learning for Signal Processing (MLSP).
[8] Tom Schaul,et al. Unifying Count-Based Exploration and Intrinsic Motivation , 2016, NIPS.
[9] Jungdam Won,et al. A scalable approach to control diverse behaviors for physically simulated characters , 2020, ACM Trans. Graph..
[10] Glen Berseth,et al. Dynamic terrain traversal skills using reinforcement learning , 2015, ACM Trans. Graph..
[11] Philippe Beaudoin,et al. Continuation methods for adapting simulated skills , 2008, ACM Trans. Graph..
[12] Sebastian Thrun,et al. Probabilistic robotics , 2002, CACM.
[13] Alexander Ilin,et al. Regularizing Model-Based Planning with Energy-Based Models , 2019, CoRL.
[14] Michiel van de Panne,et al. Learning locomotion skills using DeepRL: does the choice of action space matter? , 2016, Symposium on Computer Animation.
[15] Marwan Mattar,et al. Unity: A General Platform for Intelligent Agents , 2018, ArXiv.
[16] Sergey Levine,et al. Model-based reinforcement learning with parametrized physical models and optimism-driven exploration , 2015, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[17] Marcin Andrychowicz,et al. Hindsight Experience Replay , 2017, NIPS.
[18] Kourosh Naderi,et al. Discovering and synthesizing humanoid climbing movements , 2017, ACM Trans. Graph..
[19] Yuval Tassa,et al. Deep neuroethology of a virtual rodent , 2019, ICLR.
[20] Daniel Holden,et al. DReCon , 2019, ACM Trans. Graph..
[21] Kourosh Naderi,et al. Learning Physically Based Humanoid Climbing Movements , 2018, Comput. Graph. Forum.
[22] Hao Li,et al. Visualizing the Loss Landscape of Neural Nets , 2017, NeurIPS.
[23] Sergey Levine,et al. Learning Predictive Models From Observation and Interaction , 2019, ECCV.
[24] Yuval Tassa,et al. Synthesis and stabilization of complex behaviors through online trajectory optimization , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[25] Sergey Levine,et al. Near-Optimal Representation Learning for Hierarchical Reinforcement Learning , 2018, ICLR.
[26] Glen Berseth,et al. Terrain-adaptive locomotion skills using deep reinforcement learning , 2016, ACM Trans. Graph..
[27] Jie Tan,et al. Learning Agile Robotic Locomotion Skills by Imitating Animals , 2020, RSS.
[28] Baining Guo,et al. Terrain runner , 2012, ACM Trans. Graph..
[29] Laurent Orseau,et al. Universal Knowledge-Seeking Agents for Stochastic Environments , 2013, ALT.
[30] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[31] Quoc V. Le,et al. Searching for Activation Functions , 2018, ArXiv.
[32] Nikolaus Hansen,et al. The CMA Evolution Strategy: A Comparing Review , 2006, Towards a New Evolutionary Computation.
[33] Abhishek Gupta,et al. Learning to Reach Goals via Iterated Supervised Learning , 2019.
[34] Kourosh Naderi,et al. Intelligent Middle-Level Game Control , 2018, 2018 IEEE Conference on Computational Intelligence and Games (CIG).
[35] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[36] Sergey Levine,et al. MCP: Learning Composable Hierarchical Control with Multiplicative Compositional Policies , 2019, NeurIPS.
[37] Kourosh Naderi,et al. Self-Imitation Learning of Locomotion Movements through Termination Curriculum , 2019, MIG.
[38] Perttu Hämäläinen,et al. Deep Residual Mixture Models , 2020, ArXiv.
[39] KangKang Yin,et al. SIMBICON: simple biped locomotion control , 2007, ACM Trans. Graph..
[40] C. Karen Liu,et al. Visualizing Movement Control Optimization Landscapes , 2019, IEEE Transactions on Visualization and Computer Graphics.
[41] Pieter Abbeel,et al. Planning to Explore via Self-Supervised World Models , 2020, ICML.
[42] Sergey Levine,et al. Unsupervised Meta-Learning for Reinforcement Learning , 2018, ArXiv.
[43] Michiel van de Panne,et al. Flexible muscle-based locomotion for bipedal creatures , 2013, ACM Trans. Graph..
[44] Geoffrey E. Hinton,et al. Feudal Reinforcement Learning , 1992, NIPS.
[45] M.A. Wiering,et al. Reinforcement Learning in Continuous Action Spaces , 2007, 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning.
[46] Jaakko Lehtinen,et al. Online motion synthesis using sequential Monte Carlo , 2014, ACM Trans. Graph..
[47] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[48] Sergey Levine,et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation , 2015, ICLR.
[49] Sergey Levine,et al. Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models , 2018, NeurIPS.
[50] Mohammad Norouzi,et al. Dream to Control: Learning Behaviors by Latent Imagination , 2019, ICLR.
[51] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[52] Sergey Levine,et al. Data-Efficient Hierarchical Reinforcement Learning , 2018, NeurIPS.
[53] James Davidson,et al. TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow , 2017, ArXiv.
[54] Sergey Levine,et al. DeepMimic , 2018, ACM Trans. Graph..
[55] Wojciech Zaremba,et al. OpenAI Gym , 2016, ArXiv.
[56] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[57] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..