Off-line variable substitution for scaling points-to analysis

Most compiler optimizations and software productivity tools rely oninformation about the effects of pointer dereferences in a program.The purpose of points-to analysis is to compute this informationsafely, and as accurately as is practical. Unfortunately, accuratepoints-to information is difficult to obtain for large programs,because the time and space requirements of the analysis becomeprohibitive. We consider the problem of scaling flow- and context-insensitivepoints-to analysis to large programs, perhaps containing hundreds ofthousands of lines of code. Our approach is based on a variable substitution transformation, which is performed off-line, i.e.,before a standard points-to analysis is performed. The general idea ofvariable substitution is that a set of variables in a program can be replaced by a single representative variable, thereby reducing the input size of the problem. Our main contribution is a linear-time algorithm which finds a particular variable substitution that maintains the precision of the standard analysis, and is also very effective in reducing the size of the problem. We report our experience in performing points-to analysis on largeC programs, including some industrial-sized ones. Experiments show thatour algorithm can reduce the cost of Andersen's points-to analysis substantially: on average, it reduced the running time by 53% and the memory cost by 59%, relative to an efficient baseline implementation of the analysis.

[1]  Alexander Aiken,et al.  A Toolkit for Constructing Type- and Constraint-Based Program Analyses , 1998, Types in Compilation.

[2]  Alexander Aiken,et al.  Projection merging: reducing redundancies in inclusion constraint graphs , 2000, POPL '00.

[3]  Barbara G. Ryder,et al.  Program decomposition for pointer aliasing: a step toward practical analyses , 1996, SIGSOFT '96.

[4]  David Grove,et al.  Fast interprocedural class analysis , 1998, POPL '98.

[5]  Laurie J. Hendren,et al.  Is it a tree, a DAG, or a cyclic graph? A shape analysis for heap-directed pointers in C , 1996, POPL '96.

[6]  Susan Horwitz,et al.  Fast and accurate flow-insensitive points-to analysis , 1997, POPL '97.

[7]  Donglin Liang,et al.  Efficient points-to analysis for whole-program analysis , 1999, ESEC/FSE-7.

[8]  Jong-Deok Choi,et al.  Interprocedural pointer alias analysis , 1999, TOPL.

[9]  John H. Reppy A High-performance Garbage Collector for Standard ML , 1993 .

[10]  Robert E. Tarjan,et al.  Data structures and network algorithms , 1983, CBMS-NSF regional conference series in applied mathematics.

[11]  Donglin Liang,et al.  Equivalence analysis: a general technique to improve the efficiency of data-flow analyses in the presence of pointers , 1999, ACM SIGSOFT Softw. Eng. Notes.

[12]  Barbara G. Ryder,et al.  Data-flow analysis of program fragments , 1999, ESEC/FSE-7.

[13]  Barbara G. Ryder,et al.  A safe approximate algorithm for interprocedural aliasing , 1992, PLDI '92.

[14]  Reinhard Wilhelm,et al.  Solving shape-analysis problems in languages with destructive updating , 1998, TOPL.

[15]  Alexander Aiken,et al.  Partial online cycle elimination in inclusion constraint graphs , 1998, PLDI.

[16]  Laurie J. Hendren,et al.  Context-sensitive interprocedural points-to analysis in the presence of function pointers , 1994, PLDI '94.

[17]  Bjarne Steensgaard,et al.  Points-to analysis in almost linear time , 1996, POPL '96.

[18]  Monica S. Lam,et al.  Efficient context-sensitive pointer analysis for C programs , 1995, PLDI '95.