Paper information - A social reinforcement learning agent

A social reinforcement learning agent

We report on our reinforcement learning work on Cobot, a software agent that resides in the well-known online chat community LambdaMOO. Our initial work on Cobot [7] gave him the ability to collect social statistics and report them to users in a reactive manner. Here we describe our application of reinforcement learning to allow Cobot to proactively take actions in this complex social environment and to adapt his behavior from multiple sources of human reward. After 5 months of training, Cobot had received 3171 reward and punishment events from 254 different LambdaMOO users and had learned nontrivial preferences for a number of users. Cobot modifies his behavior based on his current state in an attempt to maximize reward. We describe LambdaMOO and the state and action spaces of Cobot, and report the statistical results of the learning experiment.

[1] Michael L. Mauldin, et al. CHATTERBOTS, TINYMUDS, and the Turing Test: Entering the Loebner Prize Competition, 1994, AAAI.

[2] Leonard N. Foner, et al. Entertaining agents: a sociological case study, 1997, AGENTS '97.

[3] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.

[4] Manuela M. Veloso, et al. Team-partitioned, opaque-transition reinforcement learning, 1999, AGENTS '99.

[5] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.

[6] Christian R. Shelton, et al. Balancing Multiple Sources of Reward in Reinforcement Learning, 2000, NIPS.

[7] Peter Stone, et al. Cobot in LambdaMOO: A Social Statistics Agent, 2000, AAAI/IAAI.

[8] Marilyn A. Walker, et al. Empirical Evaluation of a Reinforcement Learning Spoken Dialogue System, 2000, AAAI/IAAI.