The power of partial translation: an experiment with the C-ification of Binary Prolog

We describe a new language translation framework (partial translation) and the implementation of one of its instances: the C-ification of Binary Prolog. Partial C-ification is a translation framework which compiles sequences of emulator instructions down to native code (on top of a C compiler). In the case of logic programming languages, the complex control structure, some large instructions, and the management of the symbol table are left to the emulator, while the native code chunks deal with relatively long sequences of simple instructions. The technique can be seen as an automatic specialization, with respect to a given program, of the traditional instruction folding techniques used to speed up emulators. When the target language of the host compiler is the same as the implementation language of the emulator (say C), the emulator, the representation of the byte code as a C data structure, some other C-ified library routines and handwritten C code can all be compiled and linked together to form a stand-alone application. Communication between the run-time system (still under the control of the emulator) and the C-ified chunks is handled as follows. The emulated code representation of a given program (in particular the compiler itself) is mapped to a C data structure which allows exchange of symbol table information at link time. This is compiled together with the C code of the emulator to a stand-alone executable with performance in the range between pure emulators and native code implementations. We give full details and a performance analysis of the C-ification of a continuation passing Binary Prolog engine, for which large write-mode sequences and the absence of a call-return mechanism make the framework particularly well suited. The method ensures a strong operational equivalence between emulated and translated code, which share exactly the same observables in the run-time system.
An important characteristic is easy debugging of the resulting compiler, coming from the full sharing of the run-time system between emulated and compiled code and the following property we call instruction-level compositionality: if every translated instruction has the same observable effect on a (small) subset of the program state (registers and a few data areas) in emulated and translated mode, then arbitrary sequences of emulated and translated instructions are operationally equivalent.
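
The idea of partial translation and of instruction-level compositionality can be illustrated with a minimal sketch in C. Everything in it is hypothetical and chosen only for illustration: the `Machine` state, the five opcodes, the `OP_NATIVE` escape instruction and the `chunk_*` functions are assumptions, not the actual Binary Prolog instruction set or run-time system. The sketch keeps control inside an emulator loop, stores the byte code as a C data structure, and replaces a run of simple instructions with a native C function that has the same effect on the observable state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical observable state shared by emulated and translated code. */
typedef struct {
    long   reg[4];    /* abstract machine registers */
    long   heap[64];  /* a small data area */
    size_t h;         /* heap top */
} Machine;

/* Hypothetical instruction set (an assumption, not BinProlog's). */
typedef enum { OP_LOAD, OP_ADD, OP_PUSH, OP_NATIVE, OP_HALT } Opcode;

typedef struct {
    Opcode op;
    int    a, b;                /* register indices or an immediate value */
    void (*chunk)(Machine *);   /* native code, used only by OP_NATIVE */
} Instr;

/* Emulator loop: control stays here, native chunks are called from it. */
void run(Machine *m, const Instr *code) {
    for (const Instr *p = code; ; ++p) {
        switch (p->op) {
        case OP_LOAD:   m->reg[p->a] = p->b;             break;
        case OP_ADD:    m->reg[p->a] += m->reg[p->b];    break;
        case OP_PUSH:   m->heap[m->h++] = m->reg[p->a];  break;
        case OP_NATIVE: p->chunk(m);                     break;
        case OP_HALT:   return;
        default:        assert(!"unknown opcode");
        }
    }
}

/* Fully emulated program: the byte code as a C data structure. */
const Instr emulated[] = {
    {OP_LOAD, 0, 40, NULL}, {OP_LOAD, 1, 2, NULL},
    {OP_ADD,  0, 1,  NULL}, {OP_PUSH, 0, 0, NULL},
    {OP_HALT, 0, 0,  NULL},
};

/* The whole sequence folded into one native chunk. */
void chunk_all(Machine *m) {
    m->reg[0] = 40;
    m->reg[1] = 2;
    m->reg[0] += m->reg[1];
    m->heap[m->h++] = m->reg[0];
}

/* Only the last two instructions folded into a native chunk. */
void chunk_tail(Machine *m) {
    m->reg[0] += m->reg[1];
    m->heap[m->h++] = m->reg[0];
}

const Instr translated[] = {
    {OP_NATIVE, 0, 0, chunk_all}, {OP_HALT, 0, 0, NULL},
};

/* Mixed program: emulated instructions followed by a native chunk. */
const Instr mixed[] = {
    {OP_LOAD,   0, 40, NULL},       {OP_LOAD, 1, 2, NULL},
    {OP_NATIVE, 0, 0,  chunk_tail}, {OP_HALT, 0, 0, NULL},
};

/* Compare the observables field by field; returns 1 if they agree. */
int same_observables(const Machine *x, const Machine *y) {
    if (x->h != y->h) return 0;
    for (size_t i = 0; i < 4; ++i)
        if (x->reg[i] != y->reg[i]) return 0;
    for (size_t i = 0; i < x->h; ++i)
        if (x->heap[i] != y->heap[i]) return 0;
    return 1;
}

/* Runs all three variants from the same initial state; 1 if they agree. */
int demo(void) {
    Machine a = {0}, b = {0}, c = {0};
    run(&a, emulated);
    run(&b, translated);
    run(&c, mixed);
    return same_observables(&a, &b) && same_observables(&a, &c);
}
```

Calling `demo()` from a `main` function checks that the emulated, fully translated and mixed programs leave the registers and the heap in the same state. Because each native chunk is written to have the same observable effect as the instructions it replaces, any mixture of the two modes is expected to agree in this toy setting, which is the intuition behind the compositionality property stated above.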