Mixing Habits and Planning for Multi-Step Target Reaching Using Arbitrated Predictive Actor-Critic
[1] Martín Abadi, et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems, 2016, ArXiv.
[2] Mitsuo Kawato, et al. Internal models for motor control and trajectory planning, 1999, Current Opinion in Neurobiology.
[3] M. Kawato, et al. A hierarchical neural-network model for control and learning of voluntary movement, 2004, Biological Cybernetics.
[4] Michael T. Rosenstein, et al. Supervised Actor-Critic Reinforcement Learning, 2012.
[5] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[6] Sang Wan Lee, et al. The structure of reinforcement-learning mechanisms in the human brain, 2015, Current Opinion in Behavioral Sciences.
[7] Jan Peters, et al. Reinforcement learning in robotics: A survey, 2013, Int. J. Robotics Res..
[8] N. Daw, et al. Multiple Systems for Value Learning, 2014.
[9] Guy Lever, et al. Deterministic Policy Gradient Algorithms, 2014, ICML.
[10] Thomas P. Trappenberg, et al. A Novel Model for Arbitration Between Planning and Habitual Control Systems, 2017, Front. Neurorobot..
[11] M. Rosenstein, et al. Supervised Learning Combined with an Actor-Critic Architecture, 2002.
[12] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[13] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[14] Scott T. Grafton, et al. Forward modeling allows feedback control for fast reaching movements, 2000, Trends in Cognitive Sciences.
[15] P. Dayan, et al. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control, 2005, Nature Neuroscience.
[16] M. Jeannerod, et al. Constraints on human arm movement trajectories, 1987, Canadian Journal of Psychology.
[17] P. Dayan, et al. Model-based influences on humans' choices and striatal prediction errors, 2011, Neuron.
[18] Shinsuke Shimojo, et al. Neural Computations Underlying Arbitration between Model-Based and Model-free Learning, 2013, Neuron.
[19] Michael I. Jordan, et al. An internal model for sensorimotor integration, 1995, Science.
[20] Carl E. Rasmussen, et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search, 2011, ICML.
[21] D. Wolpert, et al. Internal models in the cerebellum, 1998, Trends in Cognitive Sciences.
[22] A. Barto, et al. Supervised Actor-Critic Reinforcement Learning, 2007.
[23] Richard S. Sutton. Learning to predict by the methods of temporal differences, 1988, Machine Learning.
[24] Peter Dayan, et al. A Neural Substrate of Prediction and Reward, 1997, Science.
[25] Michael I. Jordan. Computational aspects of motor control and motor learning, 2008.
[26] Martin A. Riedmiller. Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method, 2005, ECML.
[27] G. Uhlenbeck, et al. On the Theory of the Brownian Motion, 1930.