Thread‐ and Process‐based Implementations of the pSystem Parallel Programming Environment

Run-time work distribution in parallel programming systems is usually accomplished through dynamic scheduling heuristics. Their sensitivity to run-time information, such as global workload, task granularity, data dependencies and locality of information, is essential when trying to optimize performance. Adaptive schedulers that base their decisions on feedback from the system are therefore of special importance. We have developed and used a general-purpose parallel programming system, the pSystem, which has also served as a test-bed for experimenting with distinct scheduling heuristics and studying their performance. Currently, we have two versions of the system: one based on Unix processes and the other on Solaris threads. Threads (particularly user-level threads) are usually associated with low execution overheads, since they require minimal interaction with the operating system kernel. This suggests that finer-grain parallelism may be exploited more effectively with a thread-based parallel programming system. Performance analysis of both implementations over a set of well-known benchmarks, with various schedulers, shows that threads scale better under higher system loads and/or when the granularity of the tasks being executed is below a given threshold value. This paper starts with a description of the design and implementation of the pSystem computational model, followed by a detailed description of several experiments and the analysis of their results. © 1997 John Wiley & Sons, Ltd.
