Tolerating late memory traps in ILP processors
ILP processors can execute a large number of instructions at the same time, which makes it increasingly difficult to support traps efficiently. On the other hand, a current trend in architecture is to support various memory functions in software rather than hardware, usually by trapping the execution processor on a cache miss, a TLB miss, or a failed access to a local or remote memory. These late memory traps block the faulting instruction at the top of the active list, backing up the pipeline. Moreover, the support for late memory traps may affect the performance of non-faulting memory instructions as well. In this paper, we analyze the overhead caused by late memory traps in ILP processors and define several measures for this overhead. To tolerate late memory traps, we propose hardware prefetching of exception conditions and a tagged store buffer to implement deferred traps on stores. We show that, with these hardware optimizations, the overhead added by the lateness of traps is significantly reduced relative to the overhead of early traps. Because of caching effects, the frequency of late memory traps usually decreases as they are taken deeper in the memory hierarchy, and their overall impact on the execution time becomes negligible.