PreTrans: Reducing TLB CAM-search via page number prediction and speculative pre-translation
[1] Mikko H. Lipasti, et al. Exceeding the dataflow limit via value prediction, 1996, Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture. MICRO 29.
[2] Somayeh Sardashti, et al. The gem5 simulator, 2011, CARN.
[3] Scott A. Mahlke, et al. Data access microarchitectures for superscalar processors with compiler-assisted data prefetching, 1991, MICRO 24.
[4] Tomás Lang, et al. Reducing TLB power requirements, 1997, Proceedings of 1997 International Symposium on Low Power Electronics and Design.
[5] John L. Henning. SPEC CPU2006 benchmark descriptions, 2006, CARN.
[6] Mahmut T. Kandemir, et al. Generating physical addresses directly for saving instruction TLB energy, 2002, MICRO.
[7] James R. Goodman. Coherency for multiprocessor virtual address caches, 1987, ASPLOS 1987.
[8] Michel Cekleov, et al. Virtual-address caches. Part 1: problems and solutions in uniprocessors, 1997, IEEE Micro.
[9] Alan L. Cox, et al. SpecTLB: A mechanism for speculative address translation, 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).
[10] Kai Li, et al. The PARSEC benchmark suite: Characterization and architectural implications, 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[11] Avi Mendelson, et al. Using value prediction to increase the power of speculative execution hardware, 1998, TOCS.
[12] Jung Ho Ahn, et al. McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures, 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[13] Jean-Loup Baer, et al. A performance study of software and hardware data prefetching schemes, 1994, ISCA '94.
[14] W. H. Wang, et al. Organization and performance of a two-level virtual-real cache hierarchy, 1989, ISCA '89.
[15] Babak Falsafi, et al. Clearing the clouds: a study of emerging scale-out workloads on modern hardware, 2012, ASPLOS XVII.
[16] Todd M. Austin, et al. Zero-cycle loads: microarchitecture support for reducing load latency, 1995, Proceedings of the 28th Annual International Symposium on Microarchitecture.
[17] Vicki H. Allan, et al. Petri net versus modulo scheduling for software pipelining, 1995, MICRO 1995.
[18] Intel Corporation. Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 3A: System Programming Guide, Part 1, 2006.
[19] Dionisios N. Pnevmatikatos, et al. Streamlining data cache access with fast address calculation, 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[20] Ken Kennedy, et al. Software prefetching, 1991, ASPLOS IV.
[21] Michael M. Swift, et al. Reducing memory reference energy with opportunistic virtual caching, 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).