Inverse Reinforcement Learning Based Human Behavior Modeling for Goal Recognition in Dynamic Local Network Interdiction

Goal recognition is the task of inferring an agent’s goals from some or all of the agent’s observed actions. Among the possible formulations, goal recognition can be solved as a model-based planning problem using off-the-shelf planners. However, obtaining accurate cost or reward models of an agent and incorporating them into the planning model is difficult in real applications. To this end, we propose an Inverse Reinforcement Learning (IRL)-based opponent behavior modeling method and apply it to the goal-recognition-assisted Dynamic Local Network Interdiction (DLNI) problem. We first introduce the overall framework and the DLNI problem domain. We then present the IRL-based human behavior modeling method and the Markov Decision Process (MDP)-based goal recognition. Experimental results indicate that the learned behavior model achieves higher tracking accuracy and yields better interdiction outcomes than other models.
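
As a rough illustration of MDP-based goal recognition on a network, the following minimal sketch computes a posterior over candidate goals from an observed path. It is not the method evaluated in this work: the toy graph, the edge costs, the Boltzmann (softmax) action model, the `beta` parameter, and all function names are assumptions made for illustration. In particular, the edge costs are set by hand here, whereas an IRL-based approach would learn them from observed behavior.

```python
import math

# Hypothetical toy network: node -> {neighbor: edge cost}. Costs are hand-set
# for illustration; an IRL-based approach would learn them from demonstrations.
GRAPH = {
    "A": {"B": 1.0, "C": 1.0},
    "B": {"D": 1.0, "E": 3.0},
    "C": {"D": 3.0, "E": 1.0},
    "D": {},
    "E": {},
}

def cost_to_go(graph, goal, sweeps=50):
    """Value iteration for the minimum cost from each node to the goal."""
    v = {n: (0.0 if n == goal else math.inf) for n in graph}
    for _ in range(sweeps):
        for n, nbrs in graph.items():
            if n != goal and nbrs:
                v[n] = min(c + v[m] for m, c in nbrs.items())
    return v

def step_likelihood(graph, v, s, s_next, beta=1.0):
    """P(s -> s_next | goal) under an assumed softmax-rational agent."""
    weights = {m: math.exp(-beta * (c + v[m])) for m, c in graph[s].items()}
    total = sum(weights.values())
    return weights[s_next] / total if total > 0 else 0.0

def goal_posterior(graph, goals, path, prior=None, beta=1.0):
    """Bayesian update over goals given an observed node sequence."""
    prior = prior or {g: 1.0 / len(goals) for g in goals}
    scores = {}
    for g in goals:
        v = cost_to_go(graph, g)
        lik = 1.0
        for s, s_next in zip(path, path[1:]):
            lik *= step_likelihood(graph, v, s, s_next, beta)
        scores[g] = prior[g] * lik
    z = sum(scores.values())
    # Fall back to the prior if every goal has zero likelihood.
    return {g: scores[g] / z for g in goals} if z > 0 else dict(prior)

posterior = goal_posterior(GRAPH, ["D", "E"], ["A", "B"])
print(posterior)
```

After observing the move from A to B, the posterior favors goal D, because B lies on the cheaper route to D. A larger `beta` models a more nearly optimal agent and sharpens the posterior; a smaller `beta` tolerates more detours.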
