Persistent Surveillance of Events with Unknown Rate Statistics

We present a novel algorithm for persistent monitoring of stochastic events that occur at discrete locations in the environment with unknown event rates. Prior research on persistent monitoring assumes that event rates are known, which is often not the case in robotics applications. We consider the multi-objective optimization problem of maximizing the total number of events observed while keeping the observations balanced, subject to real-world autonomous system constraints. We formulate an algorithm that quantifies and leverages uncertainty over the events’ statistics to greedily generate adaptive policies that simultaneously consider learning and monitoring objectives. We analyze the favorable properties of our algorithm as a function of the number of monitoring cycles and provide simulation results demonstrating our method’s effectiveness in real-world-inspired monitoring applications.
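
To make the learning-versus-monitoring trade-off concrete, the following is a minimal illustrative sketch and not the algorithm of this paper. It assumes events at each station arrive as a Poisson process with an unknown rate, and it swaps in a generic bandit-style rule: in every cycle the robot greedily visits the station with the highest optimistic rate estimate. The exploration bonus is a heuristic borrowed from upper-confidence-bound methods, and the names `ucb_rate`, `sample_poisson`, `monitor`, and `dwell` are hypothetical. Travel time and the balance objective are ignored.

```python
import math
import random


def ucb_rate(events, obs_time, total_time):
    """Optimistic rate estimate: empirical rate plus a heuristic exploration bonus."""
    if obs_time == 0:
        return float("inf")  # unvisited stations are tried first
    mean = events / obs_time
    # Heuristic UCB-style bonus (an assumption, not a bound derived for Poisson data).
    bonus = math.sqrt(2.0 * math.log(max(total_time, 2.0)) / obs_time)
    return mean + bonus


def sample_poisson(rate, duration, rng):
    """Count arrivals in [0, duration) by summing exponential inter-arrival times."""
    count = 0
    t = rng.expovariate(rate)
    while t < duration:
        count += 1
        t += rng.expovariate(rate)
    return count


def monitor(true_rates, cycles, dwell=1.0, seed=0):
    """Greedy monitoring loop; true_rates are hidden from the decision rule."""
    rng = random.Random(seed)
    n = len(true_rates)
    events = [0] * n       # events observed so far at each station
    obs_time = [0.0] * n   # time spent observing each station
    for _ in range(cycles):
        total = sum(obs_time)
        scores = [ucb_rate(events[i], obs_time[i], total) for i in range(n)]
        best = max(range(n), key=lambda k: scores[k])
        events[best] += sample_poisson(true_rates[best], dwell, rng)
        obs_time[best] += dwell
    return events, obs_time


if __name__ == "__main__":
    seen, time_spent = monitor([0.5, 2.0, 1.0], cycles=300)
    print("events observed per station:", seen)
    print("observation time per station:", time_spent)
```

Stations that have been observed only briefly keep a large bonus, so the rule continues to sample them until their estimated rate is clearly lower than that of the alternatives. A fuller treatment would also need to account for travel between stations and for balanced observation, which this sketch omits.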
