Grace: safe multithreaded programming for C/C++

The shift from single-core to multicore architectures means that programmers must write concurrent, multithreaded programs in order to increase application performance. Unfortunately, multithreaded applications are susceptible to numerous errors, including deadlocks, race conditions, atomicity violations, and order violations. These errors are notoriously difficult for programmers to debug. This paper presents Grace, a software-only runtime system that eliminates concurrency errors for a class of multithreaded programs: those based on fork-join parallelism. By turning threads into processes, leveraging virtual memory protection, and imposing a sequential commit protocol, Grace provides programmers with the appearance of deterministic, sequential execution, while taking advantage of available processing cores to run code concurrently and efficiently. Experimental results demonstrate Grace's effectiveness: with modest code changes (1-16 lines per benchmark) across a suite of computationally intensive benchmarks, Grace can achieve high scalability and performance while preventing concurrency errors.
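The core idea of running fork-join tasks as processes and committing their effects in a fixed order can be illustrated with a small C sketch. This is not Grace's implementation: Grace tracks memory accesses with virtual memory protection, whereas this sketch simply has each child report a single value through a pipe. The names fork_join_commit, task, counter, and MAX_TASKS are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_TASKS 16   /* hypothetical limit for this sketch */

/* Looks like shared state, but after fork() every child works on its own
   copy-on-write copy, so unsynchronized updates cannot race. */
static long counter = 0;

/* Hypothetical task body: updates the "shared" counter without any lock
   and returns the change it made. */
static long task(int id) {
    long before = counter;
    counter += (id + 1) * 10L;      /* touches only this process's copy */
    return counter - before;
}

/* Runs n tasks in child processes, then commits their updates in spawn
   order. committed[i] receives the counter value after task i commits.
   Returns 0 on success, -1 if any step failed (error paths are simplified:
   a failed pipe() or fork() does not clean up earlier children). */
int fork_join_commit(int n, long *committed) {
    pid_t pids[MAX_TASKS];
    int fds[MAX_TASKS][2];
    int result = 0;

    assert(n > 0 && n <= MAX_TASKS);
    counter = 0;

    for (int i = 0; i < n; i++) {               /* fork phase */
        if (pipe(fds[i]) != 0) return -1;
        pids[i] = fork();
        if (pids[i] < 0) return -1;
        if (pids[i] == 0) {                     /* child: isolated memory */
            close(fds[i][0]);
            long delta = task(i);
            ssize_t w = write(fds[i][1], &delta, sizeof delta);
            _exit(w == (ssize_t)sizeof delta ? 0 : 1);
        }
        close(fds[i][1]);                       /* parent keeps read end */
    }

    for (int i = 0; i < n; i++) {               /* join and commit phase */
        long delta = 0;
        int status = 0;
        ssize_t r = read(fds[i][0], &delta, sizeof delta);
        close(fds[i][0]);
        if (waitpid(pids[i], &status, 0) < 0 || !WIFEXITED(status) ||
            WEXITSTATUS(status) != 0 || r != (ssize_t)sizeof delta) {
            result = -1;
            continue;
        }
        counter += delta;                       /* commit in spawn order */
        committed[i] = counter;
    }
    return result;
}
```

Calling fork_join_commit(4, out) on a POSIX system fills out with the same values on every run, regardless of which child finishes first, because the parent applies updates strictly in spawn order. The children still execute concurrently, which is where a scheme of this kind gets its parallelism.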
