A Policy Representation Using Weighted Multiple Normal Distribution
Shigenobu Kobayashi | Hajime Kimura | Takeshi Aramaki
[1] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[2] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[3] Peter Dayan, et al. Technical Note: Q-Learning, 1992, Machine Learning.
[4] Shigenobu Kobayashi, et al. An Actor-Critic Algorithm Using a Binary Tree Action Selector, 2001.
[5] Jun Morimoto, et al. Acquisition of Stand-up Behavior by a 3-link 2-joint Robot using Hierarchical Reinforcement Learning, 2001.
[6] Mitsuo Kawato, et al. MOSAIC Reinforcement Learning Architecture: Symbolization by Predictability and Mimic Learning by Symbol, 2001.
[7] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[8] Osamu Katai, et al. Fuzzy Interpolation-Based Q-Learning with Continuous Inputs and Outputs, 1999.
[9] Ashwin Ram, et al. Experiments with Reinforcement Learning in Problems with Continuous State and Action Spaces, 1997, Adapt. Behav.
[10] Cheng-Jian Lin, et al. An ART-based fuzzy adaptive learning control network, 1994, NAFIPS/IFIS/NASA '94. Proceedings of the First International Joint Conference of The North American Fuzzy Information Processing Society Biannual Conference. The Industrial Fuzzy Control and Intellige.