Constructive Policy: Reinforcement Learning Approach for Connected Multi-Agent Systems