IR-Level Dynamic Data Dependence Using Abstract Interpretation Towards Speculative Parallelization

With the widespread adoption of multicore architectures, automatic parallelization has become a pressing issue. Speculative parallelization, one of the most popular automatic parallelization techniques, depends on identifying code regions that are likely to be parallelizable. This in turn motivates applying data-dependence detection to these regions to report whether they are free of dependences and can therefore be parallelized. In this paper, we propose a runtime data-dependence detection technique based on abstract interpretation at the intermediate representation (IR) level. We apply the proposed approach to the most frequently executed blocks of the code, the hot loops. Unlike most existing approaches, in which the analysis occurs at compile time, our method performs the analysis while the code is being interpreted, which saves analysis time for potentially parallelizable loops. Specifically, the technique relies on abstract interpretation to analyze hot loops at runtime. The analysis first computes an abstract domain for each program point of a hot loop; each abstract state is then updated incrementally until a fixpoint is reached at all program points, at which point the analysis terminates and the existence of data dependences can be determined. Once the analysis reports that a finished hot loop can be parallelized, the interpreter invokes the compiler to resume execution in parallel, as recommended by our approach. The proposed technique is implemented on top of the LLVM compiler and used to test dependence detection for a set of kernels from the PolyBench suite; the data-dependence analysis required for each kernel is evaluated in terms of its computational overhead.
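The fixpoint-based analysis described above can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a simple interval abstract domain for a loop counter, iterates a hypothetical transfer function until the abstract state stabilizes, and then uses overlap between the abstract write-index and read-index sets as a coarse proxy for a loop-carried data dependence. All names (`Interval`, `fixpoint`, `transfer`) are illustrative.

```python
# Sketch: interval-domain abstract interpretation of a hot loop's index
# expressions, iterated to a fixpoint, followed by a write/read overlap
# test as a coarse data-dependence check. Illustrative only.

class Interval:
    """Abstract value: the interval [lo, hi] of possible concrete values."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def join(self, other):
        # Least upper bound in the interval lattice.
        return Interval(min(self.lo, other.lo), max(self.hi, other.hi))

    def __eq__(self, other):
        return self.lo == other.lo and self.hi == other.hi

    def overlaps(self, other):
        return self.lo <= other.hi and other.lo <= self.hi


def fixpoint(transfer, init):
    """Iterate the loop's transfer function until the abstract state
    stops changing (a fixpoint), as in classic abstract interpretation."""
    state = init
    while True:
        new_state = transfer(state)
        if new_state == state:
            return state
        state = new_state


# Example loop: for (i = 0; i < 10; i++) { a[i+1] = a[i]; }
# The counter is abstracted as an interval; the transfer function grows
# it by one step per analysis iteration, bounded by the trip count, so
# plain iteration terminates without a widening operator.
def transfer(i):
    stepped = Interval(i.lo, min(i.hi + 1, 9))
    return i.join(stepped)


i = fixpoint(transfer, Interval(0, 0))   # converges to i in [0, 9]
writes = Interval(i.lo + 1, i.hi + 1)    # indices written: a[i+1]
reads = i                                # indices read:    a[i]
has_dependence = writes.overlaps(reads)  # True: loop-carried dependence
print(i.lo, i.hi, has_dependence)
```

In a real analyzer the transfer function would be derived from the loop's IR, unbounded counters would require a widening operator to guarantee termination, and a reported overlap would only be a conservative "may depend" answer; disjoint intervals, by contrast, safely license speculative parallel execution.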
