Understanding complex, real-world systems through asynchronous, distributed decision-making algorithms

Abstract: Traditionally, the underlying decision-making algorithms for most real-world systems have been centralized. The term "real-world" refers to systems under computer control that relate to everyday life, benefit society at large, and are generally large-scale in scope. Examples include AT&T's dynamic non-hierarchical routing (DNHR) for routing telephone calls, the North American advanced train control system (ATCS) for railway routing, the Swiss banking system (SIC), and inventory management algorithms. While centralized algorithms are simple and easy to conceive and implement, they execute sequentially on uniprocessors and are slow. In addition, by their very nature, centralized algorithms are highly susceptible to natural and artificial disasters. Synchronous distributed algorithms offer a performance improvement over centralized algorithms and have been used in fault simulation, within the discipline of computer-aided design of digital systems, and in matrix manipulations. However, their performance is limited by the frequent synchronizations inherent in them. This paper critically examines the nature of large-scale, real-world systems and observes that, fundamentally, most complex systems are composed of entities, that is, concurrent, independent, and self-contained units of decision-making that interact with each other asynchronously. It presents a new class of algorithms, asynchronous, distributed, decision-making (ADDM) algorithms, to constitute the underlying control of such systems. While ADDM algorithms are closely related to autonomous decentralized systems (ADS) in their principal elements, the characteristics and boundaries of ADDM algorithms are defined rigorously. Although ADDM algorithms are difficult to conceive, design, and implement, they constitute the natural and logical choice for systems control and hold the promise of extracting the maximal parallelism inherent in these systems.
In addition, in principle, truly asynchronous systems can be described accurately only by asynchronous, distributed algorithms, never by synchronous distributed algorithms. This paper reasons about the nature of most complex real-world systems from first principles and argues that this nature is increasingly important to the design of future large-scale systems. It then presents the underlying principle of ADDM algorithms, details their fundamental characteristics, enumerates a number of successful ADDM algorithms for problems from different disciplines, and briefly reviews three of them: (1) a real-time, domestic payments-processing system, (2) distributed scheduling in railway networks, and (3) distributed routing in ATM networks.
