Improving instruction supply efficiency in superscalar architectures using instruction trace buffers

This paper addresses the problem of efficient instruction supply in superscalar architectures. The presence of branch instructions introduces two types of overhead on instruction supply efficiency, namely the delay required to resolve branch decisions and the fetch penalty due to poor code alignment. The former can be reduced by conventional branch prediction techniques. The latter, however, has not been well addressed in past research. We propose the instruction trace buffer technique to alleviate the alignment problem. An instruction trace buffer is an aggressive extension of the conventional loop buffer technique. It caches recent instruction traces in a circular buffer to predict branch behavior as well as to improve code alignment. This approach relies on the fact that dynamic branch behavior is stable in most cases. Application traces are collected to verify this assumption and to evaluate our scheme. The results indicate that instruction trace buffers lead to substantial improvement in instruction supply efficiency for numerical applications. For these applications, near-perfect (99%) efficiency is achieved without imposing undue demand on the instruction memory bandwidth, which is essential in other hardware-based alignment techniques.
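
The alignment penalty and the effect of a trace buffer can be illustrated with a toy model. The sketch below is not the paper's design. It assumes a fetch unit that reads one aligned block of `width` instructions per cycle and stops at a block boundary or a taken branch, and it models the trace buffer as a circular buffer of recently fetched addresses that is replayed for as long as it matches the actual execution path. The function names, the replay policy (most recent matching entry), and the parameters `width` and `capacity` are all assumptions made for illustration.

```python
from collections import deque


def aligned_fetch(trace, i, width):
    """Return the index just past what one aligned fetch starting at trace[i] supplies.

    Assumed model: one cycle reads a single aligned block of `width` addresses,
    and supply stops at the block boundary or at the first non-sequential address.
    """
    block_end = (trace[i] // width + 1) * width
    i += 1
    while i < len(trace) and trace[i] == trace[i - 1] + 1 and trace[i] < block_end:
        i += 1
    return i


def conventional_cycles(trace, width):
    """Cycles needed to supply the whole trace using aligned fetch only."""
    cycles, i = 0, 0
    while i < len(trace):
        i = aligned_fetch(trace, i, width)
        cycles += 1
    return cycles


def trace_buffer_cycles(trace, width, capacity):
    """Cycles needed when a circular buffer of recent addresses is consulted first.

    Hypothetical policy: on a hit, replay up to `width` buffered addresses per
    cycle while they agree with the actual path; on a miss, use aligned fetch.
    """
    buffer = deque(maxlen=capacity)  # oldest entries are overwritten when full
    cycles, i = 0, 0
    while i < len(trace):
        recent = list(buffer)
        # Most recent buffered occurrence of the next address, if any.
        pos = next((k for k in range(len(recent) - 1, -1, -1)
                    if recent[k] == trace[i]), None)
        start = i
        if pos is None:
            i = aligned_fetch(trace, i, width)
        else:
            n = 0
            while (n < width and pos + n < len(recent)
                   and i < len(trace) and recent[pos + n] == trace[i]):
                i += 1
                n += 1
        buffer.extend(trace[start:i])  # record what was actually supplied
        cycles += 1
    return cycles


def efficiency(trace, cycles, width):
    """Fraction of fetch slots that carried a useful instruction."""
    return len(trace) / (cycles * width)


if __name__ == "__main__":
    # A six-instruction loop at addresses 3..8, misaligned for 4-wide fetch.
    loop_trace = list(range(3, 9)) * 100
    for name, cycles in [
        ("aligned fetch", conventional_cycles(loop_trace, 4)),
        ("trace buffer", trace_buffer_cycles(loop_trace, 4, 16)),
    ]:
        print(name, cycles, round(efficiency(loop_trace, cycles, 4), 3))
```

In this model the misaligned loop wastes fetch slots on every iteration under aligned fetch, while the buffered trace can be replayed across both block boundaries and the loop-closing branch once the first iteration has been recorded. The model ignores buffer lookup cost and misprediction recovery, so it only indicates the direction of the effect.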