Visualization of Affect-Relations of Message Races for Debugging MPI Programs

Detecting unaffected races is important for debugging MPI parallel programs, because unaffected races can cause affected races, which do not need to be debugged. However, previous techniques cannot distinguish unaffected races from affected races, so programmers are easily overwhelmed by the volume of race-detection information. In this paper, we present a new visualization that shows programmers whether each race is affected. To this end, our technique checks, based on the happens-before relation, whether any message racing toward a race is affected, and also checks which process influences a race during an execution. After the execution, it visualizes the affect-relations of the detected races. Our visualization therefore helps programmers effectively distinguish unaffected races from affected races and debug MPI parallel programs.
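The happens-before check described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it assumes each detected race is summarized by one vector-clock timestamp (one counter per process), and it treats a race as affected whenever another race happened before it. The names `happened_before` and `classify_races`, and the three-process trace, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# One logical-clock counter per process (assumed representation).
VectorClock = Tuple[int, ...]


def happened_before(a: VectorClock, b: VectorClock) -> bool:
    """Return True if the event stamped `a` causally precedes the one stamped `b`."""
    return a != b and all(x <= y for x, y in zip(a, b))


def classify_races(races: Dict[str, VectorClock]) -> Dict[str, List[str]]:
    """Map each race to the sorted list of races that happened before it.

    In this simplified model an empty list means the race is unaffected.
    """
    return {
        name: sorted(
            other
            for other, other_clock in races.items()
            if other != name and happened_before(other_clock, clock)
        )
        for name, clock in races.items()
    }


# Hypothetical trace: three races observed in a three-process run.
races = {"r1": (2, 0, 1), "r2": (3, 2, 1), "r3": (0, 0, 2)}
affect = classify_races(races)

for name in sorted(affect):
    sources = affect[name]
    status = "affected by " + ", ".join(sources) if sources else "unaffected"
    print(f"{name}: {status}")
```

In this trace, `r1` and `r3` are concurrent with each other and preceded by no other race, so both are reported as unaffected, while `r2` is reported as affected by `r1`. The resulting mapping is the kind of affect-relation a visualization could draw as edges between races.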
