Workstation hardware and software support for parallel applications

Abstract

High-performance RISC microprocessors and their system software, such as compilers and operating systems, have primarily been optimised for use in quasi-autonomous workstations. However, as workstations become more common and the ratio of workstations to workers grows, collections of workstation-class microprocessors are increasingly viewed as cost-effective vehicles for speeding up applications. It is therefore necessary to re-evaluate current workstation systems and, in particular, to identify which features should be included or omitted to allow their use as components of parallel machines. In this paper, we consider this division of labor between workstation hardware and software, and make the case for strengthening the layer of system software, the run-time supervisors, that manages parallel applications.
