Preliminary evaluation of a binary translation system for multithreaded processors

Thread-level parallelism (TLP) is a key technology for next-generation high-performance processors. Although it provides higher processing capability, the loss of compatibility with existing processors is a crucial issue. This research is motivated by two points: (1) TLP requires either multithreaded programming, which is rather difficult for ordinary programmers, or complex compilation technologies that can exploit thread-level parallelism, and (2) existing binary codes should run efficiently on multithreaded processors. In this paper, we first propose a binary translation system that translates existing binary codes into multithreaded ones and optimizes them dynamically during execution. The system takes the original binary code as input and translates it into an internal RTL representation. It then analyzes the structure of the program and applies multithreading to loop bodies in a thread-pipelining manner. A pilot binary translator, which is part of the proposed system, was built for preliminary evaluation. The evaluation results illustrate the effectiveness of the system.
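
As a rough illustration of what applying multithreading to a loop body in a thread-pipelining manner can look like, the following Python sketch runs each loop iteration as its own thread. Each thread first computes the loop-carried value (here, only the induction variable) and forwards it to its successor, then executes the rest of the body while later iterations proceed. This is a minimal sketch under our own assumptions, not the translator described above, which operates on binary code and an RTL representation; the names `thread_pipelined_loop`, `forwarded`, and `ready` are hypothetical, and Python threads only model the structure, not real hardware parallelism.

```python
import threading


def thread_pipelined_loop(start, step, count, body):
    """Run body(i) for `count` iterations, one thread per iteration.

    Assumption: the only loop-carried dependence is the induction
    variable i, which advances by `step` each iteration.
    """
    results = [None] * count
    forwarded = [None] * (count + 1)  # loop-carried value passed thread to thread
    ready = [threading.Event() for _ in range(count + 1)]

    # The first iteration receives its induction value directly.
    forwarded[0] = start
    ready[0].set()

    def iteration(k):
        # Wait until the predecessor has forwarded the loop-carried value.
        ready[k].wait()
        i = forwarded[k]

        # Continuation stage: compute the next induction value and
        # forward it so the successor thread can start immediately.
        forwarded[k + 1] = i + step
        ready[k + 1].set()

        # Computation stage: the independent part of the loop body,
        # which may overlap with the successor iterations.
        results[k] = body(i)

    threads = [threading.Thread(target=iteration, args=(k,)) for k in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Equivalent sequential loop: for i in range(0, 10, 2): out.append(i * i)
    print(thread_pipelined_loop(0, 2, 5, lambda i: i * i))
```

The results are stored by iteration index, so the output order matches the sequential loop regardless of how the threads are scheduled.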