Combining range and inequality information for pointer disambiguation

Pentagons is an abstract domain invented by Logozzo and Fähndrich to validate array accesses in low-level programming languages. This algebraic structure provides a cheap “less-than check”, which builds a partial order between the integer variables used in a program. In this paper, we show how we have used the ideas available in Pentagons to design and implement a novel alias analysis. This new algorithm lets us disambiguate pointers with offsets, so common in C-style pointer arithmetic, in a precise and efficient way. Together with this new abstract domain, we describe several implementation decisions that let us produce a practical pointer disambiguation algorithm on top of the LLVM compiler. Our alias analysis is able to handle programs as large as SPEC’s gcc in a few minutes. Furthermore, we have been able to improve the percentage of pairs of pointers disambiguated, when compared to LLVM’s built-in analyses, by a factor of four in some benchmarks.
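
To make the combination of range and inequality information concrete, the following is a minimal sketch and not the implementation described in the paper. It assumes that each integer variable carries an interval and a set of variables known to be strictly smaller, that offsets are counted in elements, and that each access touches exactly one element. The names `Interval`, `Facts`, `strictly_less` and `no_alias` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed integer range [lo, hi] for one variable."""
    lo: int
    hi: int

    def disjoint(self, other: "Interval") -> bool:
        return self.hi < other.lo or other.hi < self.lo


@dataclass
class Facts:
    """Abstract state in the spirit of Pentagons (illustrative only)."""
    ranges: dict      # variable name -> Interval
    less_than: dict   # variable name -> set of names known to be strictly smaller


def strictly_less(facts: Facts, a: str, b: str) -> bool:
    """Return True if a < b is provable from the recorded facts.

    Only direct inequality facts are consulted here; a fuller analysis
    would also use transitivity.
    """
    if a in facts.less_than.get(b, set()):
        return True
    ra, rb = facts.ranges.get(a), facts.ranges.get(b)
    return ra is not None and rb is not None and ra.hi < rb.lo


def no_alias(facts: Facts, base1: str, off1: str, base2: str, off2: str) -> bool:
    """Return True if base1 + off1 and base2 + off2 provably differ.

    Assumption: pointers with different bases are not handled by this
    sketch, so the answer is the conservative "may alias" (False).
    """
    if base1 != base2:
        return False
    return strictly_less(facts, off1, off2) or strictly_less(facts, off2, off1)


# Inside a loop guarded by i < j, both indices may span the same range,
# so intervals alone cannot separate v + i from v + j.
facts = Facts(
    ranges={"i": Interval(0, 9), "j": Interval(0, 9)},
    less_than={"j": {"i"}},   # i < j holds inside the loop body
)
ranges_separate = facts.ranges["i"].disjoint(facts.ranges["j"])
pointers_separate = no_alias(facts, "v", "i", "v", "j")
```

In this example the two intervals overlap, so a range-only check fails to separate the accesses, while the recorded fact `i < j` is enough to conclude that `v + i` and `v + j` refer to different elements.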
