Modeling Interaction via the Principle of Maximum Causal Entropy

The principle of maximum entropy provides a powerful framework for statistical models of joint, conditional, and marginal distributions. However, there are many important distributions with elements of interaction and feedback where its applicability has not been established. This work presents the principle of maximum causal entropy—an approach based on causally conditioned probabilities that can appropriately model the availability and influence of sequentially revealed side information. Using this principle, we derive models for sequential data with revealed information, interaction, and feedback, and demonstrate their applicability for statistically framing inverse optimal control and decision prediction tasks.
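For orientation, the two quantities the abstract names can be written out as below. This is only a sketch: the notation is assumed here rather than taken from this page, with $A_{1:T}$ as the sequence being predicted and $S_{1:T}$ as the sequentially revealed side information. It follows the usual definition of causal conditioning in the directed-information literature [7].

```latex
% Causally conditioned probability: each A_t may depend only on the side
% information revealed up to time t, never on future values of S.
P(A_{1:T} \,\|\, S_{1:T}) \;=\; \prod_{t=1}^{T} P(A_t \mid S_{1:t},\, A_{1:t-1})

% Causal entropy: the expected negative log of that probability, which
% decomposes into a sum of per-step conditional entropies.
H(A_{1:T} \,\|\, S_{1:T})
  \;=\; \mathbb{E}_{A,S}\!\left[ -\log P(A_{1:T} \,\|\, S_{1:T}) \right]
  \;=\; \sum_{t=1}^{T} H(A_t \mid S_{1:t},\, A_{1:t-1})
```

By contrast, the ordinary conditional distribution $P(A_{1:T} \mid S_{1:T})$ factors into terms $P(A_t \mid S_{1:T}, A_{1:t-1})$ that condition on the whole of $S$, including values not yet revealed at time $t$. That is the mismatch the causal formulation is meant to avoid in settings with interaction and feedback.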

Anind K. Dey | J. Andrew Bagnell | Brian D. Ziebart

[1] E. Jaynes. Information Theory and Statistical Mechanics, 1957.

[2] R. Bellman. A Markovian Decision Process, 1957.

[3] R. E. Kalman, et al. When Is a Linear Control System Optimal, 1964.

[4] T. D. Parsons, et al. Pursuit-evasion in a graph, 1978.

[5] J. Massey. Causality, Feedback and Directed Information, 1990.

[6] David Heckerman, et al. Troubleshooting Under Uncertainty, 1994.

[7] Gerhard Kramer, et al. Directed information for channels with feedback, 1998.

[8] E. Yaz. Linear Matrix Inequalities In System And Control Theory, 1998, Proceedings of the IEEE.

[9] J. Pearl. Causality: Models, Reasoning and Inference, 2000.

[10] Andrew McCallum, et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, 2001, ICML.

[11] Sekhar Tatikonda, et al. Control under communication constraints, 2004, IEEE Transactions on Automatic Control.

[12] A. Dawid, et al. Game theory, maximum entropy, minimum discrepancy and robust Bayesian decision theory, 2004, math/0410076.

[13] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.

[14] Ronald A. Howard, et al. Influence Diagrams, 2005, Decis. Anal.

[15] Pieter Abbeel, et al. An Application of Reinforcement Learning to Aerobatic Helicopter Flight, 2006, NIPS.

[16] J. Andrew Bagnell, et al. Maximum margin planning, 2006, ICML.

[17] Stephen P. Boyd, et al. Convex Optimization, 2004, Algorithms and Theory of Computation Handbook.

[18] Miroslav Dudík, et al. Maximum Entropy Distribution Estimation with Generalized Regularization, 2006, COLT.

[19] Luis E. Ortiz, et al. Maximum Entropy Correlated Equilibria, 2007, AISTATS.

[20] Anind K. Dey, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.

[21] Haim H. Permuter, et al. On directed information and gambling, 2008, 2008 IEEE International Symposium on Information Theory.

[22] Joshua B. Tenenbaum, et al. Help or Hinder: Bayesian Models of Social Goal Inference, 2009, NIPS.

[23] David Silver, et al. Learning to search: Functional gradient techniques for imitation learning, 2009, Auton. Robots.

[24] E. Ostertag. Linear Matrix Inequalities, 2011.