Heuristic Search Value Iteration for One-Sided Partially Observable Stochastic Games

Security problems can be modeled as two-player partially observable stochastic games with one-sided partial observability and infinite horizon (one-sided POSGs). We seek optimal strategies for player 1 that are robust against a worst-case opponent (player 2), who is assumed to have perfect information about the game. We present a novel algorithm for approximately solving one-sided POSGs, based on heuristic search value iteration (HSVI) for POMDPs. Our results include (1) theoretical properties of one-sided POSGs and their value functions, (2) guarantees showing the convergence of our algorithm to optimal strategies, and (3) a practical demonstration of the applicability and scalability of our algorithm on three different domains: pursuit-evasion, patrolling, and search games.
