Concurrent Deadlock Detection In Parallel Programs

Abstract: Many parallel programs use message passing for communication. This leads to efficient and portable programs, but their complexity makes them hard to debug. One of the common problems in debugging such programs is detecting deadlocks. A deadlock detector, MPIDD, has been developed to dynamically detect deadlocks in parallel programs written in C++ and MPI. Detection code has been implemented for most of the blocking and non-blocking point-to-point and collective routines. The detector has been tested against an extensive test suite, application programs, and some publicly available benchmarks. It takes advantage of MPI's profiling layer, requires no significant modification of the user's code, and incurs very little overhead when invoked. Portability of the detector code is also a key advantage.
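The abstract does not say how MPIDD decides that a deadlock has occurred. As an illustration only, the sketch below shows one common approach to dynamic deadlock detection: record which rank each blocked rank is waiting on, and search that wait-for graph for a cycle. The class name `WaitForGraph` and its methods are hypothetical and are not taken from MPIDD. The sketch also assumes each blocked rank waits on one specific peer, which would not cover wildcard receives or collective calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wait-for graph (not MPIDD's actual data structure):
// an edge a -> b means rank a is blocked waiting on rank b,
// for example in a blocking receive that names b as the source.
class WaitForGraph {
public:
    explicit WaitForGraph(std::size_t ranks) : adj_(ranks) {}

    // Record that `waiter` is blocked until `target` makes progress.
    void add_wait(std::size_t waiter, std::size_t target) {
        assert(waiter < adj_.size() && target < adj_.size());
        adj_[waiter].push_back(target);
    }

    // Returns true if some ranks wait on each other in a cycle.
    bool has_deadlock() const {
        std::vector<Color> color(adj_.size(), Color::White);
        for (std::size_t r = 0; r < adj_.size(); ++r) {
            if (color[r] == Color::White && dfs(r, color)) return true;
        }
        return false;
    }

private:
    // White: unvisited, Grey: on the current DFS path, Black: finished.
    enum class Color { White, Grey, Black };

    bool dfs(std::size_t r, std::vector<Color>& color) const {
        color[r] = Color::Grey;
        for (std::size_t next : adj_[r]) {
            // Reaching a Grey node means the current path loops back on itself.
            if (color[next] == Color::Grey) return true;
            if (color[next] == Color::White && dfs(next, color)) return true;
        }
        color[r] = Color::Black;
        return false;
    }

    std::vector<std::vector<std::size_t>> adj_;
};
```

With three ranks, adding the waits 0 -> 1 and 1 -> 2 leaves the graph acyclic, so `has_deadlock()` returns false; adding 2 -> 0 closes the cycle and it returns true. In a real tool, the calls to `add_wait` would presumably be issued from wrappers around the blocking MPI routines, which is the kind of interception the profiling layer makes possible.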
