A Design Framework for Highly Concurrent Systems

Building highly concurrent systems, such as large-scale Internet services, requires managing many information flows at once and maintaining peak throughput when demand exceeds resource availability. In addition, any platform supporting Internet services must provide high availability and cope with bursty load. Many approaches to building concurrent systems have been proposed, and they generally fall into two categories: threaded and event-driven programming. We propose that threads and events lie at opposite ends of a design spectrum, and that the best implementation strategy for these applications is somewhere in between. We present a general-purpose design framework for building highly concurrent systems, based on three design components (tasks, queues, and thread pools) that encapsulate the concurrency, performance, fault isolation, and software engineering benefits of both threads and events. We also present a set of design patterns that can be applied to map an application onto an implementation using these components. Finally, we analyze several systems constructed using our framework, including an Internet services platform and a highly available, distributed, persistent data store, and demonstrate its benefit for building and reasoning about concurrent applications.
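
To make the three components concrete, the following is a minimal sketch of how tasks, a bounded queue, and a thread pool might fit together in one processing stage. It is an illustration only, not the authors' implementation: the `Stage` class, its `enqueue` and `drain` methods, and the parameters `num_threads` and `capacity` are all hypothetical names chosen for this example. Rejecting tasks when the queue is full is one assumed way to keep throughput stable under overload.

```python
import queue
import threading


class Stage:
    """Hypothetical stage: a bounded task queue drained by a fixed thread pool."""

    def __init__(self, handler, num_threads=2, capacity=8):
        self.handler = handler                      # function applied to each task
        self.tasks = queue.Queue(maxsize=capacity)  # bounded queue of pending tasks
        self.results = []                           # outputs of completed tasks
        self._lock = threading.Lock()               # guards self.results
        self.workers = [
            threading.Thread(target=self._run, daemon=True)
            for _ in range(num_threads)
        ]
        for worker in self.workers:
            worker.start()

    def enqueue(self, task):
        """Try to admit a task; return False instead of blocking when full."""
        try:
            self.tasks.put_nowait(task)
            return True
        except queue.Full:
            return False  # assumed overload policy: shed the task

    def _run(self):
        """Worker loop: take a task, run the handler, record the result."""
        while True:
            task = self.tasks.get()
            try:
                output = self.handler(task)
                with self._lock:
                    self.results.append(output)
            except Exception:
                # Fault isolation: a failing task does not kill the worker.
                pass
            finally:
                self.tasks.task_done()

    def drain(self):
        """Block until every admitted task has been processed."""
        self.tasks.join()


if __name__ == "__main__":
    stage = Stage(lambda n: n * n, num_threads=4, capacity=16)
    admitted = [stage.enqueue(n) for n in range(10)]
    stage.drain()
    print(sum(admitted), "tasks admitted;", sorted(stage.results))
```

The queue decouples the caller from the workers, as in an event-driven design, while each worker runs its handler as ordinary blocking code, as in a threaded design. Chaining several such stages, each with its own queue and pool, is one way to read the "somewhere in between" position described in the abstract.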
