Asynchronous Decision Making for Decentralised Autonomous Systems

George M. Mathews
Doctor of Philosophy, The University of Sydney, March 2008

This thesis is concerned with the design of decentralised controllers for distributed autonomous systems. This problem is encountered in any system that consists of multiple interacting autonomous entities required to make local decisions while pursuing a common objective. The thesis has its foundations in the design of optimal controllers for general decentralised stochastic systems. To overcome the computational complexity of the optimal solutions, a scheme is proposed based on open loop feedback control. Under this scheme each controller maintains a synchronised belief about the state of the system and collaborates with the other controllers in developing an open loop plan. The planning problem is formulated as an optimisation problem over the set of all individual control plans. Classical solution methods for this problem require all the information to be communicated to a single location, where standard algorithms are used to compute the optimal set of plans. Although this may be appropriate for small systems, it becomes computationally infeasible for large systems containing many controllers, and decentralised methods must be used.

The main contribution of this thesis is a decentralised and asynchronous algorithm for this optimisation problem. Attention is focused on continuous problems, where the expected cost-to-go is a smooth function of the individual plans of each controller and gradient-based methods can be used. The proposed asynchronous algorithm allows each controller to generate an initial plan and then incrementally refine it, while intermittently communicating these refinements to the other controllers in the system. This method explicitly takes into account the possible communication delays between controllers.
A convergence analysis is performed and a condition is derived that intuitively relates these delays and the inter-controller coupling structure to the rate at which each controller can refine its local plan. The coupling between two controllers is determined by the curvature of the objective function in the subspace containing their local plans. If the coupling is known to the controllers, it can be used to prioritise the use of the available communication bandwidth, with only the highly coupled controllers communicating. However, determining the coupling structure requires detailed information on the second-order derivatives of the objective function, which is generally infeasible to compute. To overcome this, an approximation method is developed that allows each controller to estimate the coupling and so dynamically determine which other controllers it must communicate with.

Finally, this scheme is applied to the control of multi-sensor systems undertaking search and tracking tasks. For these sensing systems it is demonstrated that the inter-controller coupling can be interpreted as the overlap in the information obtained by the future observations of each sensor. It is also shown that this often results in a sparse coupling structure, with each controller only required to communicate with a few other coupled controllers.
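
As an illustration of the asynchronous refinement idea described above, the following is a minimal sketch and not the algorithm developed in the thesis. It assumes a quadratic objective J(v) = 0.5 v'Av - b'v, one scalar plan per controller, and a single fixed communication delay shared by all controller pairs. Under these assumptions the coupling between controllers i and j reduces to the magnitude of the cross second derivative A[i][j]. The function names, the step size, and the threshold are hypothetical choices made for this example.

```python
from collections import deque


def async_gradient_descent(A, b, delay, step, iterations):
    """Delayed gradient descent on J(v) = 0.5 v'Av - b'v (illustrative only).

    Controller i owns the scalar plan v[i]. It always knows its own current
    plan, but sees the plans of the other controllers as they were `delay`
    iterations ago, which models a fixed communication delay.
    """
    n = len(b)
    v = [0.0] * n
    # Sliding window of the last delay + 1 joint plans; history[0] is the oldest.
    history = deque([list(v) for _ in range(delay + 1)], maxlen=delay + 1)
    for _ in range(iterations):
        stale = history[0]  # joint plan from `delay` iterations ago
        new_v = []
        for i in range(n):
            # Local gradient: own current plan plus stale plans of the others.
            grad_i = A[i][i] * v[i] - b[i]
            grad_i += sum(A[i][j] * stale[j] for j in range(n) if j != i)
            new_v.append(v[i] - step * grad_i)
        v = new_v
        history.append(list(v))
    return v


def coupled_neighbours(A, i, threshold):
    """Controllers whose cross-curvature with controller i exceeds threshold."""
    return [j for j in range(len(A)) if j != i and abs(A[i][j]) > threshold]


if __name__ == "__main__":
    # Chain of three controllers: 0 and 2 are not directly coupled.
    A = [[2.0, 0.5, 0.0],
         [0.5, 2.0, 0.5],
         [0.0, 0.5, 2.0]]
    b = [1.0, 1.0, 1.0]
    print(async_gradient_descent(A, b, delay=3, step=0.2, iterations=400))
    print(coupled_neighbours(A, 0, threshold=0.1))
```

In this example the step size is small relative to the diagonal curvature and the off-diagonal coupling, so the delayed iteration still contracts towards the minimiser of J. The sparse matrix A also shows why a sparse coupling structure limits communication: controller 0 only needs updates from controller 1.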
