Balancing Unpredictability and Coverage in Adversarial Patrolling Settings

We present a novel strategy for a patroller defending a set of heterogeneous assets from attacks carried out by an attacker who, through repeated observations, attempts to learn the patroller's strategy. Implemented as a Markov chain whose stationary distribution is a function of the values of the assets being defended and the topology of the environment, the strategy is biased towards providing more protection to valuable assets, yet is provably hard for an opponent to learn. After studying its properties, we show that our proposed method outperforms strategies commonly used for this type of problem.
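The general idea described above can be illustrated with a minimal sketch: a Metropolis-Hastings random walk over the patrol graph whose stationary distribution is proportional to the asset values. This is not the paper's algorithm; the graph, asset values, and function names below are illustrative assumptions, meant only to show how a Markov chain can bias long-run visit frequencies toward valuable assets while randomizing individual moves.

```python
import random

# Hypothetical patrol environment: adjacency list and a value per asset.
# These numbers are illustrative, not taken from the paper.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
values = {0: 1.0, 1: 2.0, 2: 4.0, 3: 1.0}

def metropolis_step(current):
    """One Metropolis-Hastings step targeting pi(i) proportional to values[i].

    Propose a uniformly random neighbour and accept with probability
    min(1, pi(j) q(j->i) / (pi(i) q(i->j))).  The walk's stationary
    distribution is then proportional to the asset values, while every
    move respects the environment's topology and stays randomized.
    """
    neighbours = graph[current]
    proposal = random.choice(neighbours)
    q_forward = 1.0 / len(neighbours)        # q(current -> proposal)
    q_backward = 1.0 / len(graph[proposal])  # q(proposal -> current)
    accept = min(1.0, (values[proposal] * q_backward)
                      / (values[current] * q_forward))
    return proposal if random.random() < accept else current

def simulate(steps=200_000, start=0):
    """Run a long patrol and return empirical visit frequencies."""
    counts = {node: 0 for node in graph}
    node = start
    for _ in range(steps):
        node = metropolis_step(node)
        counts[node] += 1
    return {n: c / steps for n, c in counts.items()}
```

In a long simulated run, each asset's visit frequency converges to its value divided by the total value (here asset 2, the most valuable, is visited about half the time), while the step-by-step trajectory remains stochastic and therefore harder for an observer to predict than a deterministic patrol route.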
