NoCFG: A Lightweight Approach for Sound Call Graph Approximation

Interprocedural analysis gathers information about an entire program rather than a single procedure, as intraprocedural analysis does. It enables more precise analysis, but it is complicated by the difficulty of constructing an accurate program call graph. Current algorithms for constructing sound and precise call graphs analyze complex program dependencies and may therefore be difficult to scale. Their complexity stems from the kind of type-inference analysis they use, in particular variations of points-to analysis. To address this problem, we propose NoCFG, a new sound and scalable method for approximating a call graph that supports a wide variety of programming languages. A key property of NoCFG is that it works on a coarse abstraction of the program, discarding many programming language constructs. Because the abstraction is coarse, extending NoCFG to additional languages is easy. We provide a formal proof of the soundness of NoCFG and evaluate it on real-world projects written in Python and C#. The experimental results demonstrate high precision (a lower bound of 90%) and scalability, shown through a security use case on projects with up to 2 million lines of code.
