Finding race conditions in Erlang with QuickCheck and PULSE

We address the problem of testing and debugging concurrent, distributed Erlang applications. Race conditions are a common class of bugs in concurrent programs and are very hard to find in practice. Traditional unit testing rarely exposes them, because their occurrence depends so heavily on timing. As a result, race conditions are often found only during system testing, where the sheer amount of code under test makes the resulting errors hard to diagnose. We present three tools (QuickCheck, PULSE, and a visualizer) that, used in combination, make it possible to test and debug concurrent programs at the unit-testing level with a much better chance of detecting race conditions. We evaluate our method on an industrial concurrent case study and illustrate how we find and analyze the race conditions.
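To make the timing dependence concrete, here is a minimal sketch in Python of a lost-update race that shows up only under some interleavings. It is a toy model, not the QuickCheck or PULSE API: the names `incrementer`, `run_schedule`, and `find_failing_seeds` are hypothetical, and the seed-driven scheduler is an assumption chosen so that a failing schedule can be replayed exactly.

```python
import random


def incrementer(store):
    """Increment store['n'] in two steps, with a scheduling point in between."""
    seen = store["n"]      # step 1: read the shared value
    yield                  # scheduling point: another process may run here
    store["n"] = seen + 1  # step 2: write back, possibly from a stale read


def run_schedule(seed, workers=2):
    """Run the workers under a scheduler whose choices are fixed by `seed`.

    Returns the final counter value and the sequence of process ids that
    were scheduled, so a failing run can be inspected and replayed.
    """
    rng = random.Random(seed)
    store = {"n": 0}
    runnable = {pid: incrementer(store) for pid in range(workers)}
    trace = []
    while runnable:
        pid = rng.choice(sorted(runnable))  # the scheduler picks who runs next
        trace.append(pid)
        try:
            next(runnable[pid])             # advance that process by one step
        except StopIteration:
            del runnable[pid]               # the process has finished
    return store["n"], trace


def find_failing_seeds(max_seed=100):
    """Collect the seeds whose schedule loses an update (final value is not 2)."""
    return [seed for seed in range(max_seed) if run_schedule(seed)[0] != 2]


if __name__ == "__main__":
    failing = find_failing_seeds()
    print("seeds that expose the race:", failing[:5])
    if failing:
        print("replayed schedule:", run_schedule(failing[0]))
```

Because the seed determines every scheduling choice, rerunning `run_schedule` with a failing seed reproduces the same interleaving, which is what makes a race found this way practical to debug.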
