Dynamic binary code translation for data prefetch optimization

CPUs that share an identical ISA increasingly differ in microarchitecture, computation resources, and special instructions. To execute programs efficiently on such hardware, compilers provide machine-dependent code optimization. However, software vendors cannot adopt this optimization in production software, because the software is widely distributed and must therefore run on any machine with the same ISA. Meanwhile, there is a significant and growing gap between the processor's operating speed and memory access speed. In this paper, we introduce several special prefetch instructions suited to memory access patterns that appear frequently during program execution. Such special instructions, however, are used only by the compiler's machine-dependent code optimization, so software vendors do not use them. To increase the opportunities for exploiting these instructions effectively, we propose dynamic optimization techniques that consist of dynamic code modification and methods for analyzing memory references. We evaluate the techniques using the SPEC2000 benchmarks.