Reverse Engineering from Mainframe Assembly to C Codes in Legacy Migration

This paper presents a method for constructing C programs from mainframe assembly programs. IBM mainframe assembly programs that are called as subroutines from programs written in a high-level language such as COBOL are automatically translated into equivalent C programs. The assembly programs are first converted into an intermediate representation (IR) in static single assignment (SSA) form. Dataflow analysis, recognition of control structures, and pattern-matching-based transformations are then applied to the IR to produce readable code. A distinguishing feature of our method is that it documents the translation process. Along with the translation, the correspondence between the source assembly code and the resulting C code is generated as documentation, which plays a very important role when manually correcting C code that was translated incompletely from architecture-dependent or self-modifying code. Furthermore, comments in the assembly programs are embedded at the appropriate positions in the resulting C programs. A prototype system based on our method successfully translated several assembly programs into C programs with function, if, and do-while structures.
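As a rough illustration of the kind of output described above, the following sketch pairs a small hand-written assembly routine (shown in the leading comment) with a C function that a translator of this kind might plausibly produce. The routine name SUMPOS, the register usage, the calling convention (R1 = table address, R2 = entry count, result in R5), and the C identifiers are all assumptions made for this example; none of them is taken from the prototype system, and the actual output format of that system may differ.

```c
#include <stdint.h>

/*
 * Hypothetical source routine (assumed for illustration only):
 *
 *   SUMPOS   LR    R3,R1        table address
 *            SR    R5,R5        total = 0
 *   LOOP     L     R4,0(,R3)    load next entry
 *            LTR   R4,R4        test sign
 *            BNP   SKIP         skip if not positive
 *            AR    R5,R4        add to total
 *   SKIP     LA    R3,4(,R3)    advance to next fullword
 *            BCT   R2,LOOP      count down, repeat
 *
 * BCT decrements R2 and branches while it is non-zero, so the body
 * always runs at least once: a natural fit for do-while.
 */
int32_t sumpos(const int32_t *table, int32_t count)
{
    const int32_t *p = table;      /* table address */
    int32_t total = 0;             /* total = 0 */

    do {
        int32_t entry = *p;        /* load next entry */
        if (entry > 0) {           /* test sign; skip if not positive */
            total += entry;        /* add to total */
        }
        p++;                       /* advance to next fullword */
    } while (--count != 0);        /* count down, repeat */

    return total;
}
```

In this sketch the original assembly comments are carried over next to the C statements derived from the corresponding instructions, which is one simple way to keep the source-to-target correspondence visible to a maintainer. Like the BCT loop it mirrors, the function assumes `count` is at least 1.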
