A parallel worklist algorithm and its exploration heuristics for static modular analyses

Abstract: One way to speed up static program analysis is to exploit today’s multi-core CPUs by parallelising the analysis. Existing work on parallel analysis usually targets traditional data-flow analyses for static, first-order languages such as C. Less attention has been given so far to the parallelisation of more general analyses that can also target dynamic, higher-order languages such as JavaScript. These are significantly more challenging to parallelise, as dependencies between analysis results are only discovered during the analysis itself. State-of-the-art parallel analyses for such languages are therefore usually limited in both their applicability and their performance gains. In this work, we propose the parallelisation of modular analyses. Modular analyses compute different parts of the analysis in isolation from one another, and therefore offer inherent opportunities for parallelisation that have not been explored so far. In addition, they can be used to develop a general class of analysers for dynamic, higher-order languages. We present a parallel variant of the worklist algorithm that is used to drive such modular analyses. To further speed up its convergence, we show how this algorithm can exploit the monotonicity of the analysis. Existing modular analyses can be parallelised without additional effort by employing this parallel worklist algorithm instead. We demonstrate this for ModF, an inter-procedural modular analysis, and for ModConc, an inter-process modular analysis. For ModConc, we reveal an additional opportunity to exploit even more parallelism in the analysis: analyses of individual ModConc components can themselves be parallel, resulting in a doubly-parallel exploration. Finally, we present several heuristics for the exploration order of the analysis and discuss how they can impact its performance.
The parallel worklist algorithm and the exploration heuristics are implemented for and integrated into MAF, a framework for modular program analysis. On a set of Scheme benchmarks for ModF, we observe speedups between 3× and 8× when using 4 workers, and between 8× and 32× when using 16 workers, with a maximum speedup of 333× using 128 workers. For ModConc, we achieve a maximum speedup of 37× with 32 workers. We observe that on a ModF analysis, among 11 exploration heuristics, the heuristics that prioritise components with either smaller environments or fewer dependencies result in consistent speedups that can reach 20× those of a random exploration strategy. We find a clear correlation between the mean number of dependencies in a program and the speedup obtained by this heuristic.
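
The general idea of a parallel worklist for a modular analysis can be illustrated with a minimal Python sketch. It is not MAF's implementation and makes several assumptions: the names `parallel_worklist`, `analyse`, `read` and `toy_analyse` are hypothetical, abstract values are plain sets joined by union, and a single coordinator thread owns the global store while workers analyse components against read-only snapshots. Because the store only grows, a result computed against an older snapshot remains safe to join; the component is simply queued again if anything it read has changed in the meantime.

```python
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait


def parallel_worklist(initial, analyse, workers=4):
    """Hypothetical sketch: analyse components in parallel until a fixpoint."""
    store = {}           # address -> frozenset of abstract values (only grows)
    deps = {}            # address -> components that have read this address
    seen = {initial}     # every component discovered so far
    queued = {initial}   # components that still need (re-)analysis
    in_flight = {}       # future -> component currently being analysed

    def run(component, snapshot):
        reads = {}       # address -> value observed in the snapshot

        def read(addr):
            reads[addr] = snapshot.get(addr, frozenset())
            return reads[addr]

        writes, spawned = analyse(component, read)
        return reads, writes, spawned

    with ThreadPoolExecutor(max_workers=workers) as pool:
        while queued or in_flight:
            # Schedule every queued component that is not already running.
            for c in [c for c in queued if c not in in_flight.values()]:
                queued.discard(c)
                in_flight[pool.submit(run, c, dict(store))] = c
            done, _ = wait(list(in_flight), return_when=FIRST_COMPLETED)
            for fut in done:
                c = in_flight.pop(fut)
                reads, writes, spawned = fut.result()
                for addr, observed in reads.items():
                    deps.setdefault(addr, set()).add(c)
                    # The snapshot was stale for this read: analyse c again.
                    if store.get(addr, frozenset()) != observed:
                        queued.add(c)
                for addr, values in writes.items():
                    old = store.get(addr, frozenset())
                    new = old | frozenset(values)   # monotone join
                    if new != old:
                        store[addr] = new
                        queued |= deps.get(addr, set())
                for s in spawned:
                    if s not in seen:
                        seen.add(s)
                        queued.add(s)
    return store


def toy_analyse(component, read):
    """Toy inter-procedural example: main calls f and g, and g uses f's result."""
    if component == "main":
        return {"ret:main": read("ret:f") | read("ret:g")}, {"f", "g"}
    if component == "f":
        return {"ret:f": {"int"}}, set()
    return {"ret:g": read("ret:f") | {"str"}}, set()
```

Calling `parallel_worklist("main", toy_analyse)` reaches the same fixpoint regardless of the order in which workers finish, which is the property that makes the exploration order a pure performance concern in this sketch.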
