HOIST: a system for automatically deriving static analyzers for embedded systems

Embedded software must meet conflicting requirements such as being highly reliable, running on resource-constrained platforms, and being developed rapidly. Static program analysis can help meet all of these goals. People developing analyzers for embedded object code face a difficult problem: writing an abstract version of each instruction in the target architecture(s). This is currently done by hand, resulting in abstract operations that are both buggy and imprecise. We have developed Hoist: a novel system that solves these problems by automatically constructing abstract operations using a microprocessor (or simulator) as its own specification. With almost no input from a human, Hoist generates a collection of C functions that are ready to be linked into an abstract interpreter. We demonstrate that Hoist generates abstract operations that are correct, having been extensively tested, sufficiently fast, and substantially more precise than manually written abstract operations. Hoist is currently limited to eight-bit machines due to costs exponential in the word size of the target architecture. It is essential to be able to analyze software running on these small processors: they are important and ubiquitous, with many embedded and safety-critical systems being based on them.
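To make the idea of using the processor as its own specification concrete, the following is a minimal sketch and not Hoist's actual algorithm or generated code. It assumes an interval abstract domain over 8-bit values and derives an abstract operation by running a concrete operation on every pair of inputs the abstract operands can represent. The names `interval8`, `concrete_add`, and `derive_abstract` are hypothetical, and `concrete_add` merely stands in for a real microprocessor or simulator.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical abstract value: an inclusive interval of 8-bit values. */
typedef struct { uint8_t lo, hi; } interval8;

/* Signature of a concrete two-operand 8-bit instruction. */
typedef uint8_t (*concrete_op)(uint8_t, uint8_t);

/* Assumption: a stand-in for the real processor or simulator.
   This models an 8-bit ADD that wraps around on overflow. */
static uint8_t concrete_add(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Derive an abstract result by exhaustive execution: run `op` on every
   concrete pair drawn from x and y, and return the smallest
   (non-wrapping) interval that contains all observed results. */
static interval8 derive_abstract(concrete_op op, interval8 x, interval8 y)
{
    assert(x.lo <= x.hi && y.lo <= y.hi);
    interval8 r = { 255, 0 };   /* empty until the first result is seen */
    for (unsigned a = x.lo; a <= x.hi; a++) {
        for (unsigned b = y.lo; b <= y.hi; b++) {
            uint8_t v = op((uint8_t)a, (uint8_t)b);
            if (v < r.lo) r.lo = v;
            if (v > r.hi) r.hi = v;
        }
    }
    return r;
}
```

For example, `derive_abstract(concrete_add, (interval8){250, 255}, (interval8){10, 10})` yields the interval [4, 9], because every sum wraps past 255. A conservative hand-written rule might instead return [0, 255] whenever overflow is possible. The sketch performs up to 2^16 executions per query for 8-bit operands, which hints at why an approach based on exhaustive execution has costs exponential in the word size.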
