Scalable pointer analysis of data structures using semantic models

Pointer analysis is widely used as a basis for many kinds of static analyses and compiler optimizations. Designing a scalable pointer analysis with acceptable precision for use in production compilers is still an open problem. Modern object-oriented languages like Java and Scala promote abstraction and code reuse, both of which make precision difficult to achieve. Collection data structures are an example of a component used pervasively in such languages, but analyzing collection implementations with full context sensitivity leads to prohibitively long analysis times. We use semantic models to reduce a complex internal implementation, e.g., that of a collection, to a small and concise model. Analyzing the model with context sensitivity yields precise results with only a modest increase in analysis time. The models must be written manually, which is feasible because a model method usually consists of only a few statements. Our implementation in GraalVM Native Image shows an increase in useful precision (a 1.35x increase in the number of checkcast statements that can be elided, relative to the default analysis configuration) at a manageable performance cost (a 19% increase in analysis time).
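
As a rough illustration of the idea, the following toy sketch models a list by a single summary "element" slot that records which types may be stored, instead of analyzing a backing array, size counter, and resizing logic. It is not the GraalVM Native Image implementation: the class `ListModelSketch`, its methods, and the use of an allocation-site name as the context are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/**
 * Toy illustration (hypothetical, not the paper's implementation): a list is
 * modeled by one summary "element" slot holding the set of types that may be stored.
 */
final class ListModelSketch {
    /** Type sets of the summary element slot, keyed by abstract list object. */
    private final Map<String, Set<Class<?>>> elementTypes = new HashMap<>();
    private final boolean contextSensitive;

    ListModelSketch(boolean contextSensitive) {
        this.contextSensitive = contextSensitive;
    }

    /** Without context sensitivity, all lists collapse into one abstract object. */
    private String key(String allocationSite) {
        return contextSensitive ? allocationSite : "<any list>";
    }

    /** Model of list.add(e): the stored type flows into the summary slot. */
    void add(String allocationSite, Class<?> storedType) {
        elementTypes.computeIfAbsent(key(allocationSite), k -> new HashSet<>()).add(storedType);
    }

    /** Model of list.get(i): every type in the summary slot may flow out. */
    Set<Class<?>> get(String allocationSite) {
        return elementTypes.getOrDefault(key(allocationSite), Set.of());
    }

    /** A checkcast on the result of get() is redundant if every possible type is a subtype of the target. */
    boolean canElideCast(String allocationSite, Class<?> target) {
        return get(allocationSite).stream().allMatch(target::isAssignableFrom);
    }
}
```

With `contextSensitive` set to `true`, a list that only ever receives `String` values keeps a type set of just `String`, so a cast to `String` on its elements could be elided. With `false`, a second list holding `Integer` values pollutes the shared slot and the cast has to stay.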
