Runtime pointer disambiguation

To optimize code effectively, compilers must reason about memory dependencies. However, the state-of-the-art heuristics available in the literature to track memory dependencies are inherently imprecise and computationally expensive. Consequently, the most advanced code transformations that compilers have today are ineffective when applied to real-world programs. The goal of this paper is to solve this conundrum through dynamic disambiguation of pointers. We provide different ways to determine at runtime whether two memory locations can overlap. We then produce two versions of a code region: one that is aliasing-free, and hence easy to optimize, and another that is not. Our checks let us branch safely to the optimizable region. We have applied these ideas to Polly-LLVM, a loop optimizer built on top of the LLVM compilation infrastructure. Our experiments indicate that our method is precise, effective and useful: we can disambiguate every pair of pointers in the loop-intensive Polybench benchmark suite. The result of this precision is code quality: the binaries we generate are 10% faster than those that Polly-LLVM produces without our optimization, at the -O3 optimization level of LLVM.
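The versioning idea can be illustrated by hand. The following is a minimal sketch in C, not the code that Polly-LLVM emits: the function names (`ranges_disjoint`, `scale_noalias`, `scale_may_alias`, `scale`) are hypothetical, and the check assumes that each pointer addresses exactly `n` floats and that the address arithmetic does not wrap around. A runtime test decides whether the two accessed ranges overlap; if they do not, control branches to a version whose `restrict` qualifiers tell the compiler that the accesses are independent.

```c
#include <stddef.h>
#include <stdint.h>

/* Runtime check (illustrative): returns 1 when the byte ranges
 * [a, a + n) and [b, b + n) of floats do not overlap.
 * Assumes the address arithmetic does not wrap around. */
static int ranges_disjoint(const float *a, const float *b, size_t n) {
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;
    uintptr_t bytes = (uintptr_t)(n * sizeof(float));
    return pa + bytes <= pb || pb + bytes <= pa;
}

/* Aliasing-free version: restrict promises the compiler that dst and
 * src never overlap, so the loop can be vectorized or reordered. */
static void scale_noalias(float *restrict dst, const float *restrict src,
                          size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] = 2.0f * src[i];
}

/* Conservative version: a store to dst[i] may change a later src[j],
 * so the iterations must run in their original order. */
static void scale_may_alias(float *dst, const float *src, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] = 2.0f * src[i];
}

/* Entry point: the check guards the branch to the optimizable region. */
void scale(float *dst, const float *src, size_t n) {
    if (ranges_disjoint(dst, src, n))
        scale_noalias(dst, src, n);
    else
        scale_may_alias(dst, src, n);
}
```

When the ranges overlap, for example `scale(buf + 1, buf, 3)`, the check fails and the conservative loop preserves the sequential semantics of the original code; when they are disjoint, the `restrict` version is free to be optimized aggressively.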
