Unleashing concurrency for irregular data structures

To make accesses to an irregular data structure atomic, developers often resort to coarse-grained locking. The hierarchical nature of such a structure makes fine-grained locking difficult and error-prone to reason about, because an update to a field of an ancestor may affect its descendants. Coarse-grained locking, however, disallows concurrent accesses to the entire data structure and therefore yields a low degree of concurrency. We propose an approach, built upon the Multiple Granularity Lock (MGL), that replaces coarse-grained locks to unleash more concurrency for irregular data structures. Our approach is widely applicable and does not require the data structures to have special shapes. We derive the MGL locks by reasoning about the hierarchy of the data structure and the accesses to it. In our evaluation on widely used applications, the optimization brings significant speedups, e.g., at least 7%-20% and up to 2x.
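As a rough illustration of the locking discipline that MGL refers to, the following is a minimal sketch in Java of hierarchical locking with intention modes. It is not the implementation described here: the names `MglSketch`, `Node`, `lock`, and `unlock` are hypothetical, the combined SIX mode is omitted, locks are not reentrant, and the lock hierarchy is assumed to be a simple parent-pointer tree.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch only; all names here are assumptions, not the paper's API.
class MglSketch {
    /** Lock modes: intention-shared, intention-exclusive, shared, exclusive. */
    enum Mode { IS, IX, S, X }

    /** Classic compatibility matrix, indexed as COMPAT[held][requested]. */
    static final boolean[][] COMPAT = {
        //           IS     IX     S      X
        /* IS */ { true,  true,  true,  false },
        /* IX */ { true,  true,  false, false },
        /* S  */ { true,  false, true,  false },
        /* X  */ { false, false, false, false },
    };

    static boolean compatible(Mode held, Mode requested) {
        return COMPAT[held.ordinal()][requested.ordinal()];
    }

    /** One lockable node of the hierarchy; the root has a null parent. */
    static final class Node {
        final Node parent;
        // Number of current holders per mode (no per-thread ownership tracking).
        private final int[] holders = new int[Mode.values().length];

        Node(Node parent) { this.parent = parent; }

        private boolean canGrant(Mode requested) {
            for (Mode held : Mode.values()) {
                if (holders[held.ordinal()] > 0 && !compatible(held, requested)) {
                    return false;
                }
            }
            return true;
        }

        /** Non-blocking attempt; returns false if a held mode conflicts. */
        synchronized boolean tryAcquire(Mode m) {
            if (!canGrant(m)) return false;
            holders[m.ordinal()]++;
            return true;
        }

        /** Blocks until the mode can be granted; restores the interrupt flag. */
        synchronized void acquire(Mode m) {
            boolean interrupted = false;
            while (!canGrant(m)) {
                try { wait(); } catch (InterruptedException e) { interrupted = true; }
            }
            holders[m.ordinal()]++;
            if (interrupted) Thread.currentThread().interrupt();
        }

        synchronized void release(Mode m) {
            holders[m.ordinal()]--;
            notifyAll();
        }
    }

    /** Intention mode that ancestors must hold for a request on a descendant. */
    static Mode intentionFor(Mode m) {
        return (m == Mode.X || m == Mode.IX) ? Mode.IX : Mode.IS;
    }

    /** Takes intention locks on all ancestors, root first, then locks the target. */
    static void lock(Node target, Mode m) {
        Deque<Node> path = new ArrayDeque<>();
        for (Node n = target.parent; n != null; n = n.parent) path.push(n);
        for (Node n : path) n.acquire(intentionFor(m)); // iterates root to parent
        target.acquire(m);
    }

    /** Releases in the reverse order: target first, then ancestors up to the root. */
    static void unlock(Node target, Mode m) {
        target.release(m);
        for (Node n = target.parent; n != null; n = n.parent) n.release(intentionFor(m));
    }
}
```

In this sketch, a writer that calls `lock(left, Mode.X)` on one subtree leaves only an IX mark on the root, so another thread can still take an exclusive lock on a sibling subtree, whereas a request for S or X on the root itself has to wait. A single coarse-grained lock would serialize both writers.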
