Accelerating program analyses by cross-program training

Practical programs share large modules of code, yet many program analyses fail to reuse analysis results for that shared code across programs. We present POLYMER, an analysis optimizer that addresses this problem. POLYMER runs the analysis offline on a corpus of training programs and learns analysis facts over shared code. It prunes the learnt facts to eliminate intermediate computations, then reuses the pruned facts to accelerate the analysis of other programs that share code with the training corpus. We have implemented POLYMER to accelerate analyses specified in Datalog, and we apply it to optimize two analyses for Java programs: a call-graph analysis that is flow- and context-insensitive, and a points-to analysis that is flow- and context-sensitive. We evaluate the resulting analyses on ten programs from the DaCapo suite that share the JDK library. POLYMER achieves average speedups of 2.6× for the call-graph analysis and 5.2× for the points-to analysis.
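The learn, prune, and reuse steps can be illustrated with a toy reachability analysis. The sketch below is not POLYMER's algorithm. It is a minimal Python illustration under stated assumptions: the method names, the `LIB_ENTRY_POINTS` set, and the pruning rule (keep only facts whose source is a library entry point) are all hypothetical, and the library is assumed never to call back into application code.

```python
def transitive_closure(edges):
    """Naive fixpoint for the Datalog-style rules
         reach(a, b) :- edge(a, b).
         reach(a, c) :- reach(a, b), edge(b, c)."""
    reach = set(edges)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(reach):
            for (b2, c) in edges:
                if b == b2 and (a, c) not in reach:
                    reach.add((a, c))
                    changed = True
    return reach


def from_sources(facts, sources):
    """Keep only the facts whose first component is in `sources`."""
    return {(a, b) for (a, b) in facts if a in sources}


# Shared library code (hypothetical call edges).
LIB_EDGES = {
    ("lib.api", "lib.helper1"),
    ("lib.helper1", "lib.helper2"),
    ("lib.helper2", "lib.sink"),
}
LIB_ENTRY_POINTS = {"lib.api"}  # assumption: the only method applications call

# Offline: analyze the shared code once, then prune intermediate facts
# (those that start at internal helpers rather than at an entry point).
learnt_facts = transitive_closure(LIB_EDGES)
summary = from_sources(learnt_facts, LIB_ENTRY_POINTS)

# Online: a new application that shares the library.
APP_EDGES = {("main", "app.run"), ("app.run", "lib.api")}
APP_METHODS = {"main", "app.run"}

# Baseline: re-analyze application and library together from scratch.
full = from_sources(transitive_closure(APP_EDGES | LIB_EDGES), APP_METHODS)

# Reuse: substitute the pruned summary for the library's internal edges.
fast = from_sources(transitive_closure(APP_EDGES | summary), APP_METHODS)

print(len(learnt_facts), len(summary), fast == full)
```

In this toy setting the summary holds three facts instead of the six learnt ones, and the application-level results of the two runs agree. With library callbacks into application code, this simple pruning rule would no longer be sufficient.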
