Characterizing the synchronization behavior of parallel programs

Contention for synchronization locks and delays waiting for synchronization events can substantially increase the running time of a parallel program. It is therefore important to characterize the synchronization behavior of programs and to provide analysis tools that help both hardware and software designers evaluate design alternatives. This paper describes a tracing facility that is incorporated into a synchronization package. The facility provides a portable means of characterizing parallel programs accurately and efficiently. We have monitored the behavior of several applications, uncovering program characteristics that make it difficult to achieve linear speedup. The monitoring facility allows a programmer to determine the performance implications of the chosen synchronization structure, and it allows an architect to evaluate various hardware support mechanisms.
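The idea of building tracing into the synchronization package itself can be illustrated with a minimal sketch. It is not the facility described in the paper: the names `TracedLock`, `summarize`, and `run_demo` are hypothetical, and Python threads stand in for the processes of a real multiprocessor. The wrapper records, for every lock acquisition, how long the caller waited and how long it then held the lock.

```python
import threading
import time


class TracedLock:
    """Hypothetical lock wrapper that records wait and hold times."""

    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()
        self.events = []          # one (thread name, wait s, hold s) per acquisition
        self._waited = 0.0        # written only by the current lock holder
        self._acquired_at = 0.0   # written only by the current lock holder

    def __enter__(self):
        start = time.perf_counter()
        self._lock.acquire()      # time spent blocked here is contention delay
        now = time.perf_counter()
        self._waited = now - start
        self._acquired_at = now
        return self

    def __exit__(self, exc_type, exc, tb):
        held = time.perf_counter() - self._acquired_at
        # The lock is still held, so appending to the trace cannot race.
        self.events.append((threading.current_thread().name, self._waited, held))
        self._lock.release()
        return False


def summarize(lock):
    """Aggregate the trace into totals a designer might inspect."""
    return {
        "acquisitions": len(lock.events),
        "total_wait": sum(wait for _, wait, _ in lock.events),
        "total_hold": sum(hold for _, _, hold in lock.events),
    }


def run_demo(workers=4, iterations=5):
    """Run a few threads that contend for one traced lock."""
    lock = TracedLock("shared-counter")
    counter = [0]

    def work():
        for _ in range(iterations):
            with lock:
                counter[0] += 1
                time.sleep(0.001)  # stand-in for work done in the critical section

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return lock, counter[0]


if __name__ == "__main__":
    traced, count = run_demo()
    print(count, summarize(traced))
```

Comparing `total_wait` against `total_hold` in the summary gives a rough indication of how much running time is lost to contention rather than spent doing work inside critical sections.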
