ICE: Binary analysis that you can see

Tools for high-level languages often help developers comprehend complex systems without worrying about low-level details. However, new architectures and paradigms pose new challenges in program comprehension that often require high-level reasoning about low-level issues, sometimes even at the level of processor instructions. This is particularly true for the new generation of developers learning to harness the power of SIMD operations and multi-core, multiprocessor systems. Though industrial-strength tools for malware analysts are available, these typically come at considerable cost and require extensive expertise. Our proposed solution is to extend high-level comprehension tools, commonly available in IDEs, to low-level representations. This paper presents the design and prototype implementation of an Integrated Comprehension Environment (ICE), an Eclipse-based tool suite extended to analyse code in intermediate and assembly languages. A preliminary evaluation based on visualisations for wayfinding, call graphs, sequence diagrams and control flow shows (1) correspondence to the requirements for comprehension tools in this domain, (2) flexibility in the spectrum of data sources it can accept, and (3) scalability with respect to the explosion of instructions in the code base, while still providing a means to build new visualisations for analysis.
