Automatically Deriving Control-Flow Graph Generators from Operational Semantics

We develop the first theory of control-flow graphs from first principles, and use it to create an algorithm for automatically synthesizing many variants of control-flow graph (CFG) generators from a language's operational semantics. Our approach first introduces a new algorithm for converting a large class of small-step operational semantics to abstract machines. It then uses a technique called "abstract rewriting" to automatically abstract the semantics of a language. The abstracted semantics is used both to generate a CFG directly from a program ("interpreted mode") and to generate standalone code, similar to a human-written CFG generator, that works for any program in the language. We show how the choice of two parameters, an abstraction and a projection, allows our approach to synthesize several families of CFG generators useful for different kinds of tools. We prove the correspondence between the generated graphs and the original semantics. We also provide an algorithm for automatically proving the termination of interpreted-mode generators, and prove it correct. In addition to our theoretical results, we have implemented this algorithm in a tool called Mandate, and show that it produces human-readable code on two medium-size languages with 60-80 rules, featuring nearly all intraprocedural control constructs common in modern languages. We then show that these CFG generators are sufficient to build two static analyzers atop them. Our work is a promising step towards the grand vision of being able to synthesize all desired tools from the semantics of a programming language.
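To make the idea of interpreted mode concrete, the following is a minimal sketch and not Mandate's actual algorithm. It assumes a toy statement language whose conditions and right-hand sides have already been abstracted away, so that `if` and `while` branch nondeterministically. The term encoding and the names `step`, `build_cfg`, and `project` are all hypothetical and chosen only for illustration. The sketch runs an abstract machine (a term paired with a continuation stack) over a program and records every abstract transition as a CFG edge.

```python
from collections import deque

# Terms are nested tuples so that machine states are hashable:
#   ("assign", var), ("seq", s1, s2), ("if", s1, s2), ("while", body), ("skip",)
# Conditions and expressions are omitted: they are assumed to be abstracted away.

def step(state):
    """Abstract transition relation: return every possible successor state."""
    term, stack = state
    tag = term[0]
    if tag == "seq":
        # Run the first statement; push the second onto the continuation stack.
        return [(term[1], (term[2],) + stack)]
    if tag == "if":
        # The condition is unknown, so both branches are possible.
        return [(term[1], stack), (term[2], stack)]
    if tag == "while":
        # Either run the body and return to the loop, or exit the loop.
        return [(term[1], (term,) + stack), (("skip",), stack)]
    # "assign" and "skip" are atomic: continue with the next pending statement.
    if stack:
        return [(stack[0], stack[1:])]
    return []  # empty stack: final state

def build_cfg(program, project=lambda state: state):
    """Explore all reachable abstract states; `project` maps states to CFG nodes."""
    start = (program, ())
    seen = {start}
    edges = set()
    work = deque([start])
    while work:
        s = work.popleft()
        for t in step(s):
            edges.add((project(s), project(t)))
            if t not in seen:  # memoization keeps the exploration finite
                seen.add(t)
                work.append(t)
    nodes = {project(s) for s in seen}
    return project(start), nodes, edges

if __name__ == "__main__":
    # x := ...; while (...) { y := ... }
    prog = ("seq", ("assign", "x"), ("while", ("assign", "y")))
    _, nodes, edges = build_cfg(prog)
    print(len(nodes), "nodes,", len(edges), "edges")
```

In this sketch the loop body steps back to the same abstract `while` state, which yields the back edge of the loop and lets the exploration stop. Passing a different `project` function, for example `lambda s: s[0]` to keep only the current term, is one simple way to picture how a projection parameter changes the shape of the resulting graph.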
