Paper Information - A Bayesian Approach to Imitation in Reinforcement Learning

A Bayesian Approach to Imitation in Reinforcement Learning

In multiagent environments, forms of social learning such as teaching and imitation have been shown to aid the transfer of knowledge from experts to learners in reinforcement learning (RL). We recast the problem of imitation in a Bayesian framework. Our Bayesian imitation model allows a learner to smoothly pool prior knowledge, data obtained through interaction with the environment, and information inferred from observations of expert agent behaviors. Our model integrates well with recent Bayesian exploration techniques, and can be readily generalized to new settings.
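As a rough illustration of what pooling the three information sources could look like, the sketch below keeps a Dirichlet posterior over next-state probabilities for a single state-action pair and adds pseudo-counts from the prior, the learner's own transitions, and transitions observed from an expert. It is a minimal sketch and not the paper's algorithm: the function names, the example counts, and the simplifying assumption that expert transitions can be attributed to the same action as the learner's are all hypothetical.

```python
def pooled_dirichlet(prior, own_counts, mentor_counts):
    """Return Dirichlet parameters after pooling three count vectors.

    Assumption (not from the source): expert observations are treated as
    if they came from the same action as the learner's own transitions,
    so pooling reduces to adding counts element-wise.
    """
    return [p + o + m for p, o, m in zip(prior, own_counts, mentor_counts)]


def posterior_mean(alpha):
    """Mean of a Dirichlet distribution with parameters alpha."""
    total = sum(alpha)
    return [a / total for a in alpha]


# Hypothetical example with three possible next states.
prior = [1.0, 1.0, 1.0]   # uniform prior pseudo-counts
own = [2, 0, 0]           # transitions the learner experienced itself
mentor = [0, 6, 0]        # transitions observed from the expert

alpha = pooled_dirichlet(prior, own, mentor)
mean = posterior_mean(alpha)
```

With these made-up counts, the expert's observations dominate the estimate for the second next state, while the prior keeps every outcome at nonzero probability.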

Craig Boutilier | Bob Price

[1] Claude Sammut, et al. Learning to Fly, 1992, ML.

[2] Leslie Pack Kaelbling, et al. Learning in embedded systems, 1993.

[3] Michael L. Littman, et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning, 1994, ICML.

[4] Masayuki Inaba, et al. Learning by watching: extracting reusable task knowledge from visual observation of human performance, 1994, IEEE Trans. Robotics Autom.

[5] Stefan Schaal, et al. Robot Learning From Demonstration, 1997, ICML.

[6] Michael Kearns, et al. Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms, 1998, NIPS.

[7] Michael P. Wellman, et al. Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm, 1998, ICML.

[8] David Andre, et al. Model based Bayesian Exploration, 1999, UAI.

[9] Craig Boutilier, et al. Implicit Imitation in Multiagent Reinforcement Learning, 1999, ICML.

[10] Andrew Y. Ng, et al. Algorithms for Inverse Reinforcement Learning, 2000, ICML.

[11] Craig Boutilier, et al. Imitation and Reinforcement Learning in Agents with Heterogeneous Actions, 2001, Canadian Conference on AI.

[12] Maja J. Matarić, et al. Sensory-motor primitives as a basis for imitation: linking perception to action and biology to robotics, 2002.

[13] Andrew W. Moore, et al. Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time, 1993, Machine Learning.

[14] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.