Exploitation-Oriented Learning PS-r#
[1] Andrew Y. Ng,et al. Pharmacokinetics of a novel formulation of ivermectin after administration to goats , 2000, ICML.
[2] Michael I. Jordan,et al. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems , 1994, NIPS.
[3] Shigenobu Kobayashi,et al. Reinforcement Learning for Penalty Avoidance in Continuous State Spaces , 2007, J. Adv. Comput. Intell. Intell. Informatics.
[4] Pieter Abbeel,et al. Exploration and apprenticeship learning in reinforcement learning , 2005, ICML.
[5] Michael I. Jordan,et al. Learning Without State-Estimation in Partially Observable Markovian Decision Processes , 1994, ICML.
[6] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[7] Craig Boutilier,et al. Computing Optimal Policies for Partially Observable Decision Processes Using Compact Representations , 1996, AAAI/IAAI, Vol. 2.
[8] Andrew McCallum,et al. Instance-Based Utile Distinctions for Reinforcement Learning with Hidden State , 1995, ICML.
[9] Theodore J. Perkins,et al. Reinforcement learning for POMDPs based on action values and stochastic optimization , 2002, AAAI/IAAI.
[10] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 1992, Machine Learning.
[11] Shigenobu Kobayashi,et al. Reinforcement Learning by Stochastic Hill Climbing on Discounted Reward , 1995, ICML.
[12] Douglas Aberdeen,et al. Scalable Internal-State Policy-Gradient Methods for POMDPs , 2002, ICML.
[13] Shigenobu Kobayashi,et al. Reinforcement learning for penalty avoiding policy making , 2000, 2000 IEEE International Conference on Systems, Man and Cybernetics (SMC 2000).
[14] Shigenobu Kobayashi,et al. Reinforcement Learning in POMDPs with Function Approximation , 1997, ICML.
[15] Shigenobu Kobayashi,et al. An Extension of Profit Sharing to Partially Observable Markov Decision Processes: Proposition of PS-r* and its Evaluation , 2003.
[16] Lonnie Chrisman,et al. Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach , 1992, AAAI.
[17] Kathryn E. Merrick,et al. Motivated reinforcement learning for adaptive characters in open-ended simulation games , 2007, ACE '07.