Time- and space-efficient flow-sensitive points-to analysis

Compiling real-world programs often takes hours. The term "nightly build", familiar to industrial researchers, is an artifact of these long compilation times. Our goal is to reduce absolute analysis times for large C programs (on the order of millions of lines). Pointer analysis is one of the key analyses performed during compilation: its scalability is paramount to the efficiency of the overall compilation process, and its precision directly affects that of the client analyses. In this work, we design a time- and space-efficient flow-sensitive pointer analysis and parallelize it on graphics processing units (GPUs). Our analysis uses an extended Bloom filter, called multibloom, to store points-to information approximately, and expresses the analysis in terms of operations over the multibloom. Since a Bloom filter is a probabilistic data structure, we develop ways to recover the analysis precision. We achieve effective parallelization through memory coalescing, reduced thread divergence, and improved load balance across GPU warps. Compared to a state-of-the-art sequential solution, our parallel version achieves a 7.8× speedup with less than 5% precision loss on a suite of six large programs. Using two client transformations, we show that this precision loss only minimally affects a client's precision.
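
To make the idea of storing points-to information approximately more concrete, the following is a minimal sketch of a plain Bloom filter over (pointer, pointee) pairs. It is not the multibloom structure described above, whose layout the abstract does not specify. The class name `PointsToBloom`, its parameters, the BLAKE2b-based hashing, and the `union` helper are all assumptions chosen for illustration.

```python
import hashlib


class PointsToBloom:
    """Plain Bloom filter over (pointer, pointee) pairs.

    Illustrative only: a hypothetical stand-in, not the paper's multibloom.
    """

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, pointer, pointee):
        # Derive num_hashes bit positions by salting one hash function.
        key = f"{pointer}->{pointee}".encode()
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                key, digest_size=8, salt=bytes([seed])
            ).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, pointer, pointee):
        """Record that `pointer` may point to `pointee`."""
        for pos in self._positions(pointer, pointee):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def may_point_to(self, pointer, pointee):
        """True if the pair was added, or (rarely) on a false positive."""
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(pointer, pointee)
        )

    def union(self, other):
        """Bitwise OR of two filters with identical parameters.

        Assumption: used here to mimic merging facts at a control-flow join.
        """
        assert (self.num_bits, self.num_hashes) == (other.num_bits, other.num_hashes)
        merged = PointsToBloom(self.num_bits, self.num_hashes)
        merged.bits = bytearray(a | b for a, b in zip(self.bits, other.bits))
        return merged


then_branch = PointsToBloom()
then_branch.add("p", "x")
else_branch = PointsToBloom()
else_branch.add("p", "y")
after_join = then_branch.union(else_branch)
```

A filter of this kind never reports a stored pair as absent, so a may-point-to query stays conservative; the cost is an occasional false positive, which is the kind of precision loss the abstract refers to.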
