Path-Sensitive Analysis Using Edge Strings

Path sensitivity improves the quality of static analysis by avoiding the approximate merging of dataflow facts collected along distinct program paths. Because full path sensitivity is prohibitively expensive, it is worthwhile to consider hybrid approaches that provide path sensitivity on selected subsets of paths. In this paper, we consider such a technique based on an edge string, a compact abstraction of a set of static program paths. The edge string es = [e1, e2, ..., ek], where each ei is an edge label found in a program’s control-flow graph, is used to disambiguate dataflow facts that manifest only on paths in which es occurs as a subsequence. The length of es dictates the tradeoff between precision and analysis cost. Loosely speaking, edge strings are a path-sensitive analog of the call strings exploited by context-sensitive analyses. We present a formalization of edge strings and discuss optimizations that incorporate additional relevance measures, based on the structure of the control-flow graph, to avoid exploring edge-string paths when no added precision accrues. We also provide a detailed implementation study in the context of the functional SSA intermediate representation used by MLton, a whole-program optimizing compiler for Standard ML. Our results indicate that small edge strings provide the necessary precision to identify infeasible paths for functional programs that leverage complex control and dataflow.
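
To make the subsequence condition concrete, the following is a minimal sketch in Python, not the formalization or the MLton implementation described here. It assumes paths are given explicitly as lists of edge labels paired with a constant fact for a single variable, and it uses a flat constant lattice. The names `occurs_as_subsequence`, `join`, `merged_fact`, `partitioned_facts`, and the `TOP` marker are all hypothetical. A real analysis would propagate facts over the control-flow graph rather than enumerate paths.

```python
from typing import List, Optional, Sequence, Tuple, Union

TOP = "TOP"  # hypothetical marker for "not a known constant"
Fact = Union[int, str]
PathFact = Tuple[Sequence[str], Fact]


def occurs_as_subsequence(es: Sequence[str], path: Sequence[str]) -> bool:
    """True if the labels of es appear in path in order, not necessarily adjacent."""
    remaining = iter(path)
    # `label in remaining` advances the iterator, so order is respected.
    return all(label in remaining for label in es)


def join(a: Fact, b: Fact) -> Fact:
    """Join in a flat constant lattice: equal facts survive, others go to TOP."""
    return a if a == b else TOP


def merged_fact(paths: List[PathFact]) -> Optional[Fact]:
    """Path-insensitive result: join the facts of every path (None if no paths)."""
    result: Optional[Fact] = None
    for _, fact in paths:
        result = fact if result is None else join(result, fact)
    return result


def partitioned_facts(
    es: Sequence[str], paths: List[PathFact]
) -> Tuple[Optional[Fact], Optional[Fact]]:
    """Join facts separately for paths that contain es and paths that do not."""
    matching = [pf for pf in paths if occurs_as_subsequence(es, pf[0])]
    others = [pf for pf in paths if not occurs_as_subsequence(es, pf[0])]
    return merged_fact(matching), merged_fact(others)


# Two paths through a conditional: x = 1 on the branch labeled "t1",
# x = 2 on the branch labeled "f1"; both reach the join edge "j".
paths: List[PathFact] = [(["t1", "j"], 1), (["f1", "j"], 2)]

everything_merged = merged_fact(paths)
split_on_t1 = partitioned_facts(["t1"], paths)
```

In this toy setting, merging at the join point loses the constant for x, while keeping a separate fact for paths that contain the edge string `["t1"]` preserves one constant per partition. A longer edge string would split the paths more finely at a higher cost, which is the tradeoff the length of es controls.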
