The Complexity of Andersen's Analysis in Practice

While the tightest proven worst-case complexity for Andersen's points-to analysis is nearly cubic, the analysis seems to scale better on real-world codes. We examine algorithmic factors that help account for this gap. In particular, we show that a simple algorithm can compute Andersen's analysis in worst-case quadratic time as long as the input program is k -sparse , i.e. it has at most k statements dereferencing each variable and a sparse flow graph. We then argue that for strongly-typed languages like Java, typical structure makes programs likely to be k -sparse, and we give empirical measurements across a suite of Java programs that confirm this hypothesis. We also discuss how various standard implementation techniques yield further constant-factor speedups.

[1]  Alexander Aiken,et al.  Projection merging: reducing redundancies in inclusion constraint graphs , 2000, POPL '00.

[2]  Simon Goldsmith,et al.  Measuring empirical computational complexity , 2007, ESEC-FSE '07.

[3]  Simon L. Peyton Jones,et al.  Imperative functional programming , 1993, POPL '93.

[4]  Swarat Chaudhuri,et al.  Subcubic algorithms for recursive state machines , 2008, POPL '08.

[5]  Ben Hardekopf,et al.  Exploiting Pointer and Location Equivalence to Optimize Pointer Analysis , 2007, SAS.

[6]  Ondrej Lhoták,et al.  Context-Sensitive Points-to Analysis: Is It Worth It? , 2006, CC.

[7]  Atanas Rountev,et al.  Off-line variable substitution for scaling points-to analysis , 2000, PLDI '00.

[8]  Thomas W. Reps,et al.  Program analysis via graph reachability , 1997, Inf. Softw. Technol..

[9]  O. Lhoták Spark: A flexible points-to analysis framework for Java , 2002 .

[10]  Lars Ole Andersen,et al.  Program Analysis and Specialization for the C Programming Language , 2005 .

[11]  Chris Hankin,et al.  Online cycle detection and difference propagation for pointer analysis , 2003, Proceedings Third IEEE International Workshop on Source Code Analysis and Manipulation.

[12]  David A. McAllester,et al.  Linear-time subtransitive control flow analysis , 1997, PLDI '97.

[13]  Manu Sridharan,et al.  Demand-driven points-to analysis for Java , 2005, OOPSLA '05.

[14]  Olin Shivers,et al.  Control flow analysis in scheme , 1988, PLDI '88.

[15]  Alexander Aiken,et al.  Banshee: A Scalable Constraint-Based Analysis Toolkit , 2005, SAS.

[16]  Olivier Tardieu,et al.  Demand-driven pointer analysis , 2001, PLDI '01.

[17]  Helmut Seidl,et al.  Propagating Differences: An Efficient New Fixpoint Algorithm for Distributive Constraint Systems , 1998, Nord. J. Comput..

[18]  Ben Hardekopf,et al.  The ant and the grasshopper: fast and accurate pointer analysis for millions of lines of code , 2007, PLDI '07.

[19]  Monica S. Lam,et al.  Cloning-based context-sensitive pointer alias analysis using binary decision diagrams , 2004, PLDI '04.

[20]  Amer Diwan,et al.  The DaCapo benchmarks: java benchmarking development and analysis , 2006, OOPSLA '06.

[21]  Barbara G. Ryder,et al.  Points-to analysis for Java using annotated constraints , 2001, OOPSLA '01.

[22]  Ondrej Lhoták,et al.  Scaling Java Points-to Analysis Using SPARK , 2003, CC.

[23]  Barbara G. Ryder,et al.  Parameterized object sensitivity for points-to analysis for Java , 2005, TSEM.

[24]  Jianwen Zhu,et al.  Symbolic pointer analysis revisited , 2004, PLDI '04.

[25]  David A. McAllester,et al.  On the cubic bottleneck in subtyping and flow analysis , 1997, Proceedings of Twelfth Annual IEEE Symposium on Logic in Computer Science.

[26]  David J. Pearce,et al.  Some directed graph algorithms and their application to pointer analysis , 2005 .

[27]  Bjarne Steensgaard,et al.  Points-to analysis in almost linear time , 1996, POPL '96.

[28]  Alexander Aiken,et al.  Partial online cycle elimination in inclusion constraint graphs , 1998, PLDI.

[29]  Manu Sridharan,et al.  Refinement-based context-sensitive points-to analysis for Java , 2006, PLDI '06.

[30]  Ondrej Lhoták,et al.  Points-to analysis using BDDs , 2003, PLDI '03.

[31]  Olivier Tardieu,et al.  Ultra-fast aliasing analysis using CLA: a million lines of C code in a second , 2001, PLDI '01.