SCAF: a speculation-aware collaborative dependence analysis framework

Program analysis determines the potential dataflow and control flow relationships among instructions so that compiler optimizations can respect these relationships and transform code correctly. Because many of these relationships rarely or never occur at run time, speculative optimizations assert that they do not exist while optimizing the code. To preserve correctness, speculative optimizations add validation checks that activate recovery code when an assertion proves false. This approach misses many opportunities because program analysis, and thus every other optimization, remains unaware of the full impact of these dynamically enforced speculative assertions. To address this problem, this paper presents SCAF, a Speculation-aware Collaborative dependence Analysis Framework. SCAF learns of available speculative assertions via profiling, computes their full impact on memory dependence analysis, and makes the resulting information available to all code optimizations. SCAF is modular (adding new analysis modules is easy) and collaborative (modules cooperate to produce a result more precise than the confluence of all individual results). By computing the full impact of speculation on memory dependence analysis, SCAF dramatically reduces the need for expensive-to-validate memory speculation in the hot loops of all 16 evaluated C/C++ SPEC benchmarks, relative to the best prior speculation-aware dependence analysis technique.
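The collaborative idea can be illustrated with a toy sketch. This is not SCAF's actual API: every class, method, and string below (`Ensemble`, `StaticAllocSites`, `PointsToProfile`, `DistinctObjects`, the pointer names `p` and `q`) is a hypothetical name chosen for illustration, and the "analysis" is reduced to looking up which object a pointer refers to. The sketch assumes one module holds static facts, one holds a profile-derived speculative assertion, and a third disproves a dependence only by asking the whole ensemble a premise query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Answer:
    no_dep: bool                         # True if the dependence is disproved
    assertions: frozenset = frozenset()  # speculative assertions relied on


MAY_DEP = Answer(False)  # conservative answer: a dependence may exist


class Module:
    """Base class (hypothetical): a module may answer either kind of query."""

    def underlying_object(self, ptr, top):
        return None

    def no_dependence(self, p, q, top):
        return MAY_DEP


class StaticAllocSites(Module):
    """Static facts: pointers known to refer to a given allocation site."""

    def __init__(self, sites):
        self.sites = sites

    def underlying_object(self, ptr, top):
        if ptr in self.sites:
            return self.sites[ptr], frozenset()
        return None


class PointsToProfile(Module):
    """Speculative facts: profiling saw ptr refer to only one object."""

    def __init__(self, observed):
        self.observed = observed

    def underlying_object(self, ptr, top):
        if ptr in self.observed:
            obj = self.observed[ptr]
            # The answer is only valid if this assertion is checked at run time.
            return obj, frozenset({f"{ptr} points only to {obj}"})
        return None


class DistinctObjects(Module):
    """Disproves a dependence when both pointers resolve to different objects.

    It knows nothing about pointers itself; it asks the ensemble instead.
    """

    def no_dependence(self, p, q, top):
        a, b = top.underlying_object(p), top.underlying_object(q)
        if a and b and a[0] != b[0]:
            return Answer(True, a[1] | b[1])
        return MAY_DEP


class Ensemble:
    """Routes each query, including premise queries from modules, to all modules."""

    def __init__(self, modules):
        self.modules = modules

    def underlying_object(self, ptr):
        for m in self.modules:
            res = m.underlying_object(ptr, self)
            if res is not None:
                return res
        return None

    def no_dependence(self, p, q):
        for m in self.modules:
            ans = m.no_dependence(p, q, self)
            if ans.no_dep:
                return ans
        return MAY_DEP


static = StaticAllocSites({"p": "A"})
profile = PointsToProfile({"q": "B"})
reasoner = DistinctObjects()

alone_static = Ensemble([static, reasoner]).no_dependence("p", "q")
alone_profile = Ensemble([profile, reasoner]).no_dependence("p", "q")
together = Ensemble([static, profile, reasoner]).no_dependence("p", "q")
```

In this toy setup neither fact source suffices on its own, so `alone_static` and `alone_profile` both stay at the conservative answer, while `together` disproves the dependence and records the single speculative assertion it relied on. Returning the assertions alongside the answer is a design choice of the sketch: it lets a client see what must be validated at run time before it commits to using the result.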
