Assembly to high-level language translation

Translating assembly code to high-level language code is important for maintaining legacy code, as well as for program understanding, porting, and code recovery. We present the techniques used in asm2c, a SPARC assembly to C translator. The techniques rely on data flow and control flow analyses. The data flow analysis eliminates machine dependencies from the assembly code and recovers high-level language expressions. The control flow analysis recovers control structure statements. Simple data type recovery is also performed. The presented techniques extend and improve on techniques previously developed for CISC architectures. The choice of intermediate representation allows the analyses to support both RISC and CISC assembly code. We tested asm2c on SPARC assembly programs generated by a C compiler from the SPEC95 benchmarks. Results for both unoptimized and optimized assembly code are presented.
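
To make the idea of expression recovery concrete, the following is a minimal sketch of forward substitution over a single basic block. It is an illustration only and not asm2c's actual algorithm or intermediate representation: the tuple-based instruction format, the `C_OPS` table, and the `recover_expression` function are all hypothetical names introduced here, and register names simply stand in for variables.

```python
# Toy forward substitution over one basic block.
# Hypothetical sketch; this is not asm2c's code or its intermediate representation.

# Assumed mapping from a few SPARC-style mnemonics to C operators.
C_OPS = {"add": "+", "sub": "-", "sll": "<<", "and": "&", "or": "|"}


def recover_expression(block, target):
    """Return a C-like expression string for the final value of `target`.

    Each instruction is (op, src1, src2, dest), mirroring the SPARC
    operand order `op rs1, rs2, rd`.
    """
    exprs = {}  # register name -> expression currently held in it
    for op, src1, src2, dest in block:
        # Substitute known definitions; unknown operands stay as they are.
        left = exprs.get(src1, src1)
        right = exprs.get(src2, src2)
        exprs[dest] = f"({left} {C_OPS[op]} {right})"
    return exprs.get(target, target)


block = [
    ("add", "%o0", "%o1", "%l0"),  # %l0 = %o0 + %o1
    ("sll", "%l0", "2", "%l1"),    # %l1 = %l0 << 2
    ("sub", "%l1", "%o2", "%o0"),  # %o0 = %l1 - %o2
]
print(recover_expression(block, "%o0"))
```

The temporaries `%l0` and `%l1` disappear from the result, which is the sense in which substitution removes a machine dependency. A real translator would also need to handle condition codes, memory operands, calls, and definitions that reach across basic blocks, none of which this sketch attempts.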