An empirical examination of the prevalence of inhibitors to the parallelizability of open source software systems

An empirical study is presented that examines the potential to parallelize general-purpose software systems. The study is conducted on 13 open source systems comprising over 14 MLOC. Each for-loop is statically analyzed to determine whether it can be parallelized. A for-loop that can be parallelized is termed a free-loop. Free-loops can be easily parallelized using tools such as OpenMP. For the loops that cannot be parallelized, the various inhibitors to parallelization are determined and tabulated. The data show that the most prevalent inhibitor, by far, is calls within for-loops to functions that have side effects. This single inhibitor poses the greatest challenge in adapting and re-engineering systems to better utilize modern multi-core architectures. This finding somewhat contradicts the literature, which focuses primarily on removing data dependencies within loops. The results also show that function calls via function pointers and virtual methods have very little impact on the for-loop parallelization process. Historical data on inhibitor counts over a 10-year period for the set of systems studied are also presented. They show that the potential for parallelizing loops changes little over time.
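The distinction between a free-loop and a loop inhibited by a side-effecting call can be illustrated with a minimal C sketch. It is not taken from the study or its analysis tooling. The names `scale_free`, `scale_inhibited`, `scale_and_count`, and `call_count` are hypothetical, and the sketch assumes that `out` and `in` do not overlap.

```c
#include <stddef.h>

/* Free-loop: iteration i reads only in[i] and writes only out[i],
   so no iteration depends on any other (assuming out and in do not
   overlap). An OpenMP work-sharing directive can be applied directly;
   without OpenMP enabled the pragma is ignored and the loop runs serially. */
void scale_free(double *out, const double *in, int n, double k)
{
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        out[i] = k * in[i];
    }
}

/* Shared state modified by the helper below (hypothetical). */
static int call_count = 0;

/* Hypothetical helper with a side effect: it updates a global counter. */
double scale_and_count(double x, double k)
{
    call_count++;          /* write to shared state */
    return k * x;
}

/* Not a free-loop: the called function has a side effect. Adding
   "#pragma omp parallel for" here without further changes would
   introduce a data race on call_count, so the loop is left serial. */
void scale_inhibited(double *out, const double *in, int n, double k)
{
    for (int i = 0; i < n; ++i) {
        out[i] = scale_and_count(in[i], k);
    }
}
```

Both functions compute the same array, but only the first can be handed to OpenMP as written. Making the second loop parallel would require removing or isolating the side effect, for example by moving the counter update out of the callee.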
