Tuning lock-based multicore program based on sliding windows to tolerate data race

Because in-house debugging and testing cannot easily uncover all potential data races in multicore programs, it is necessary to tolerate the remaining data races during production runs to ensure correct execution. However, existing tolerating methods handle only some kinds of data races. This paper proposes a new data-race tolerating approach that detects and adjusts data races whether the accesses involved are protected by a critical section or unprotected, improving the correctness of multicore programs. It uses sliding windows to hold the memory instructions in a critical section, or the most recent unprotected memory instructions, and detects the potential data races that are more likely to cause errors. Then, by delaying the critical reversion points, it adjusts the data races to reduce the probability of software failure. Implementing the approach requires no change to the multicore processor's original cache coherence protocol and only a small amount of additional hardware. Simulation results show low hardware overhead, low bandwidth overhead, and negligible slowdown.
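As a rough software illustration of the sliding-window idea in the abstract, the sketch below keeps a bounded window of a thread's recent memory accesses and checks whether an incoming remote access conflicts with any of them (same address, at least one write). This is only an assumption-laden model: the approach in the paper is implemented in hardware, and the names `Access`, `SlidingWindow`, `record`, and `conflicts_with` are hypothetical. The step of delaying the conflicting access is not modeled.

```python
from collections import deque
from typing import NamedTuple


class Access(NamedTuple):
    """One memory access: the address touched and whether it was a write."""
    addr: int
    is_write: bool


class SlidingWindow:
    """Hypothetical bounded window of one thread's recent memory accesses."""

    def __init__(self, capacity: int) -> None:
        # A deque with maxlen silently drops the oldest entry when full,
        # which gives the "sliding" behavior.
        self.entries = deque(maxlen=capacity)

    def record(self, addr: int, is_write: bool) -> None:
        """Add a local access to the window."""
        self.entries.append(Access(addr, is_write))

    def conflicts_with(self, remote: Access) -> bool:
        """Return True if the remote access races with a windowed access.

        Two accesses conflict when they touch the same address and at
        least one of them is a write.
        """
        return any(
            local.addr == remote.addr and (local.is_write or remote.is_write)
            for local in self.entries
        )


window = SlidingWindow(capacity=2)
window.record(0x10, is_write=True)
window.record(0x20, is_write=False)

# A remote read of an address written inside the window is a conflict.
write_read_conflict = window.conflicts_with(Access(0x10, is_write=False))

# Two reads of the same address never conflict.
read_read_conflict = window.conflicts_with(Access(0x20, is_write=False))

# Recording a third access slides the window and evicts address 0x10.
window.record(0x30, is_write=False)
conflict_after_eviction = window.conflicts_with(Access(0x10, is_write=True))
```

In this model, a detected conflict is the point at which a tolerating mechanism would choose to delay the remote access; the window capacity bounds how far back conflicts can be seen, which is the trade-off a small hardware structure would face.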
