Paper information - Theory and practice of coordination algorithms exploiting the generalised distributive law

A key challenge for modern computer science is the development of technologies that allow interacting computer systems, typically referred to as agents, to coordinate their decisions whilst operating in an environment with minimal human intervention. By so doing, the decision-making capabilities of each of these agents should be improved, because each agent takes into account what the remaining agents intend to do. Against this background, the focus of this thesis is to study and design new coordination algorithms capable of achieving this improved performance. In this line of work, there are two key research challenges that need to be addressed. First, the current state-of-the-art coordination algorithms have only been tested in simulation. This means that their practical performance still needs to be demonstrated in the real world. Second, none of the existing algorithms are capable of solving problems where the agents need to coordinate over complex decisions, which typically require trading off several parameters such as multiple objectives, the parameters of a sufficient statistic, and the sample value and bounds of an estimator. However, such parameters typically characterise the agents’ interactions within many real-world domains. For this reason, deriving algorithms capable of addressing such complex interactions is a key challenge to bring research in coordination algorithms one step closer to successful deployment. The aim of this thesis is to address these two challenges. To achieve this, we make two types of contribution. First, we develop a set of practical contributions to address the challenge of testing the performance of state-of-the-art coordination algorithms in the real world.
More specifically, we perform a case study on the deployment of the max-sum algorithm, a well-known coordination algorithm, on a system that allows the first responders at the scene of a disaster to request imagery collection tasks over some of the most relevant areas from a team of unmanned aerial vehicles (UAVs). These agents then coordinate to complete the largest number of tasks. In more detail, max-sum is based on the generalised distributive law (GDL), a well-known algebraic framework that has been used in disciplines such as artificial intelligence, machine learning and statistical physics to derive effective algorithms to solve optimisation problems. Our contribution is the deployment of max-sum on real hardware and the evaluation of its performance in a real-world setting. More specifically, we deploy max-sum on two UAVs (hexacopters) and test it in a number of different settings. These tests show that max-sum does indeed perform well when confronted with the complexity and the unpredictability of the real world. The second category of contributions is theoretical in nature. More specifically, we propose a new framework and a set of solution techniques to address the complex interactions requirement. To achieve this, we move back to theory and tackle a new class of problem involving agents engaged in complex interactions defined by multiple parameters. We name this class partially ordered distributed constraint optimisation problems (PO-DCOPs). Essentially, this generalises the well-known distributed constraint optimisation problem (DCOP) framework to settings in which agents make decisions over multiple parameters such as multiple objectives, the parameters of a sufficient statistic, and the sample value and bounds of an estimator.
To measure the quality of these decisions, it becomes necessary to strike a balance between these parameters and, to achieve this, the outcome of these decisions is represented using partially ordered constraint functions. Given this framework, we present three sub-classes of PO-DCOPs, each focusing on a different type of complex interaction. More specifically, we study (i) multi-objective DCOPs (MO-DCOPs), in which the agents’ decisions are defined over multiple objectives, (ii) risk-aware DCOPs (RA-DCOPs), in which the outcome of the agents’ decisions is not known with certainty and thus the agents need to carefully weigh the risk of making decisions that might lead to poor and unexpected outcomes, and (iii) multi-armed bandit DCOPs (MAB-DCOPs), where the agents need to learn the outcome of their decisions online. To solve these problems, we again exploit the GDL framework. In particular, we employ the flexibility of the GDL to obtain either optimal or bounded approximate algorithms to solve PO-DCOPs. The key insight is to use the algebraic properties of the GDL to instantiate well-known DCOP algorithms such as DPOP, Action GDL or bounded max-sum to solve PO-DCOPs. Given the properties of these algorithms, we derive a new set of solution techniques. To demonstrate their effectiveness, we study the properties of these algorithms empirically on various instances of MO-DCOPs, RA-DCOPs and MAB-DCOPs. Our experiments emphasise two key traits of the algorithms. First, bounded approximate algorithms perform well in terms of our requirements. Second, optimal algorithms incur an increase in both the computation and communication load necessary to solve PO-DCOPs because they are trying to optimally solve a problem which is potentially more complex than canonical DCOPs.
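
The algebraic idea behind the GDL can be illustrated with a minimal sketch. It is not the thesis implementation: the two-variable chain, the utility functions (`f1`, `f12`, `g1`, `g12`) and all helper names are hypothetical and chosen only for illustration. The first half shows the scalar case, where pushing `max` inside the sum gives the same optimum as brute force. The second half swaps `max` for a Pareto (non-dominated) filter and `+` for vector addition, which is one simple way to picture a partially ordered, multi-objective variant.

```python
from itertools import product

DOMAIN = (0, 1)  # hypothetical binary decision domain for both agents

# --- Scalar case: max-sum on a chain x1 - x2 (illustrative utilities) ---
def f1(x1):
    return 2 if x1 == 0 else 1

def f12(x1, x2):
    return 3 if x1 != x2 else 0

def brute_force():
    # Enumerate every joint assignment.
    return max(f1(a) + f12(a, b) for a, b in product(DOMAIN, DOMAIN))

def eliminate():
    # Distributive law: max_{a,b}[f1(a) + f12(a,b)] = max_a[f1(a) + max_b f12(a,b)].
    message = {a: max(f12(a, b) for b in DOMAIN) for a in DOMAIN}
    return max(f1(a) + message[a] for a in DOMAIN)

# --- Partially ordered case: utilities are vectors, "max" is a Pareto filter ---
def dominates(u, v):
    # u dominates v if it is at least as good everywhere and differs somewhere.
    return u != v and all(x >= y for x, y in zip(u, v))

def pareto(points):
    pts = set(points)
    return {p for p in pts if not any(dominates(q, p) for q in pts)}

def vadd(u, v):
    return tuple(x + y for x, y in zip(u, v))

def g1(x1):
    return (2, 0) if x1 == 0 else (0, 2)

def g12(x1, x2):
    return (3, 0) if x1 != x2 else (0, 1)

def mo_brute_force():
    return pareto(vadd(g1(a), g12(a, b)) for a, b in product(DOMAIN, DOMAIN))

def mo_eliminate():
    # The message for each value of x1 is a set of non-dominated vectors.
    message = {a: pareto(g12(a, b) for b in DOMAIN) for a in DOMAIN}
    return pareto(vadd(g1(a), m) for a in DOMAIN for m in message[a])

if __name__ == "__main__":
    print(brute_force(), eliminate())
    print(sorted(mo_brute_force()), sorted(mo_eliminate()))
```

In the vector case the messages are sets rather than single numbers, which gives an intuition for why optimal algorithms for partially ordered problems can need more computation and communication than their canonical DCOP counterparts.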

Francesco Maria Delle Fave | F. D. Fave
