Inferring the Intentions of Learning Agents

This thesis addresses the problem of inferring the goals, represented as a utility function, of an intelligent agent that is learning in a known way. It generalizes the standard inverse reinforcement learning (IRL) problem, which assumes that the agent does not (need to) learn and makes decisions optimally; here, the agent is instead assumed to learn and make decisions in known, though not necessarily optimal, ways. In addition to laying out the general problem, I identify two of its special cases and solve each with a polynomial-time algorithm: one formulates the problem as a linear program, and the other casts it as maximum a posteriori estimation and uses a gradient descent approach that is guaranteed to converge.
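
To make the linear-programming idea concrete, the sketch below is a generic illustration of posing utility inference as a linear program, not the formulation developed in this thesis. It assumes that each observed choice is summarized by feature vectors, that utility is linear in those features, and that observed choices should score at least as well as their alternatives; the margin-maximizing objective and all names (infer_utility_weights, demonstrations, and so on) are illustrative assumptions.

```python
# A minimal, generic sketch of utility inference as a linear program.
# Assumptions (not from the thesis): utility is linear in features,
# u(s, a) = w . phi(s, a), and each observed choice should have at least
# as much utility as every alternative, by the largest possible margin.
import numpy as np
from scipy.optimize import linprog


def infer_utility_weights(demonstrations, n_features):
    """demonstrations: list of (chosen_features, [alternative_features, ...])."""
    # Decision variables: x = [w_1, ..., w_d, t], where t is the worst-case margin.
    # Maximize t  <=>  minimize -t.
    c = np.zeros(n_features + 1)
    c[-1] = -1.0

    # For each observed choice, require  w . (phi_chosen - phi_alt) >= t,
    # written as  -(phi_chosen - phi_alt) . w + t <= 0  for linprog's A_ub x <= b_ub.
    rows = []
    for chosen, alternatives in demonstrations:
        for alt in alternatives:
            diff = np.asarray(chosen, dtype=float) - np.asarray(alt, dtype=float)
            rows.append(np.concatenate([-diff, [1.0]]))
    A_ub = np.vstack(rows)
    b_ub = np.zeros(len(rows))

    # Bound the weights so the LP is bounded; leave the margin variable free.
    bounds = [(-1.0, 1.0)] * n_features + [(None, None)]
    result = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return result.x[:n_features], result.x[-1]  # inferred weights, achieved margin


# Example usage with two hand-made demonstrations over 3 features.
demos = [
    ([1.0, 0.0, 0.5], [[0.0, 1.0, 0.5], [0.2, 0.2, 0.2]]),
    ([0.9, 0.1, 0.4], [[0.1, 0.9, 0.4]]),
]
weights, margin = infer_utility_weights(demos, n_features=3)
print("inferred weights:", weights, "margin:", margin)
```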