A Tool for Detecting Message Races in MPI Parallel Programs

ABSTRACT Message races should be detected for debugging effectively message-passing programs because they can cause non-deterministic executions of a program. Previous tools for detecting message races report that message races occur in every receive operation which is expected to receive any messages. However message races might not occur in the receive operation if each of messages is transmitted through a different logical communication channel so that their incorrect detection makes it a difficult task for programmers to debug programs. In this paper we suggest a tool, MPIRace-Check, which can exactly detect message races by checking the concurrency between send/receive operations, and by inspecting the logical communication channels of the messages. To detect message races, this tool uses the vector timestamp to check if send and receive operations are concurrent during an execution of a program and it also uses the message envelop to inspect if the logical communication channels of transmitted messages are the same. In our experiment, we show that our tool can exactly detect message races with efficiency using MPI_RTED and a benchmark program. By detecting message races exactly, therefore, our tool enables programmers to develop reliable parallel programs reducing the burden of debugging. Keyword:Message-passing Programs, Message Races, Debugging, MPIRace-Check