Dual-Mandate Patrols: Multi-Armed Bandits for Green Security

Conservation efforts in green security domains to protect wildlife and forests are constrained by the limited availability of defenders (i.e., patrollers), who must patrol vast areas to protect them from attackers (e.g., poachers or illegal loggers). Defenders must choose how much time to spend in each region of the protected area, balancing exploration of infrequently visited regions and exploitation of known hotspots. We formulate the problem as a stochastic multi-armed bandit, where each action represents a patrol strategy, enabling us to guarantee the rate of convergence of the patrolling policy. However, a naive bandit approach would compromise short-term performance for long-term optimality, resulting in animals poached and forests destroyed. To speed up convergence, we leverage smoothness in the reward function and decomposability of actions. We show a synergy between Lipschitz continuity and decomposition, as each aids the convergence of the other. In doing so, we bridge the gap between combinatorial and Lipschitz bandits, presenting a no-regret approach that tightens existing guarantees while optimizing for short-term performance. We demonstrate that our algorithm, LIZARD, improves performance on real-world poaching data from Cambodia.
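The abstract's core idea of exploiting Lipschitz smoothness to tighten bandit confidence bounds can be illustrated with a minimal sketch. This is not the LIZARD algorithm itself, only a standard Lipschitz-UCB variant on a one-dimensional discretized arm space: an arm's upper confidence bound is clipped by the bound implied by every other arm via the Lipschitz constant `L`, so observations at one patrol strategy also constrain nearby strategies. All names (`lipschitz_ucb`, `reward_fn`) and the Gaussian-noise reward model are illustrative assumptions.

```python
import math
import random


def lipschitz_ucb(reward_fn, arms, L, horizon, seed=0):
    """UCB over discretized arms in [0, 1], tightened via Lipschitz continuity.

    If |mu(x) - mu(y)| <= L * |x - y|, then for every observed arm j,
    mu(arm_i) <= mean_j + conf_j + L * |arm_i - arm_j|. Taking the minimum
    over all j gives a tighter upper bound than naive per-arm UCB alone
    (the j = i term recovers the naive bound).
    """
    rng = random.Random(seed)
    n = len(arms)
    counts = [0] * n
    sums = [0.0] * n
    total_reward = 0.0

    for t in range(1, horizon + 1):
        if t <= n:
            i = t - 1  # initialize: pull each arm once
        else:
            ucbs = []
            for a in range(n):
                # Tightest upper bound on arm a implied by any arm j.
                bound = min(
                    sums[j] / counts[j]
                    + math.sqrt(2 * math.log(t) / counts[j])
                    + L * abs(arms[a] - arms[j])
                    for j in range(n)
                )
                ucbs.append(bound)
            i = max(range(n), key=lambda k: ucbs[k])

        # Noisy reward observation (illustrative Gaussian noise model).
        r = reward_fn(arms[i]) + rng.gauss(0.0, 0.1)
        counts[i] += 1
        sums[i] += r
        total_reward += r

    return counts, total_reward
```

Under a smooth reward peaked at 0.5, e.g. `reward_fn = lambda x: 1 - abs(x - 0.5)` with `L = 1.0`, pulls concentrate near the peak faster than with naive UCB, since a single observation shrinks the bounds of all nearby arms at once — the information-sharing effect the abstract attributes to Lipschitz continuity.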
