Exploring Predictive Replacement Policies for Instruction Cache and Branch Target Buffer
Daniel A. Jiménez | Samira Mirbagher Ajorpaz | Elba Garza | Sangam Jindal
[1] Yale N. Patt,et al. A comprehensive instruction fetch mechanism for a processor supporting speculative execution , 1992, MICRO 25.
[2] James R. Goodman,et al. Instruction Cache Replacement Policies and Organizations , 1985, IEEE Transactions on Computers.
[3] Scott A. Mahlke,et al. EFetch: Optimizing instruction fetch for event-driven web applications , 2014, 2014 23rd International Conference on Parallel Architecture and Compilation (PACT).
[4] Babak Falsafi,et al. Using dead blocks as a virtual victim cache , 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[5] Ulrich Mayer,et al. Two level bulk preload branch prediction , 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).
[6] Wen-mei W. Hwu,et al. Run-time Adaptive Cache Hierarchy Via Reference Analysis , 1997, ISCA 1997.
[7] Chau-Wen Tseng,et al. Compiler optimizations for improving data locality , 1994, ASPLOS VI.
[8] Thomas A. Ziaja,et al. Sparc T4: A Dynamically Threaded Server-on-a-Chip , 2012, IEEE Micro.
[9] Henry G. Dietz,et al. Improving cache performance by selective cache bypass , 1989, Proceedings of the Twenty-Second Annual Hawaii International Conference on System Sciences. Volume 1: Architecture Track.
[10] Mahmut T. Kandemir,et al. Leakage energy management in cache hierarchies , 2002, Proceedings. International Conference on Parallel Architectures and Compilation Techniques.
[11] Babak Falsafi,et al. Dead-block prediction & dead-block correlating prefetchers , 2001, ISCA 2001.
[12] Daniel A. Jiménez,et al. The impact of delay on the design of branch predictors , 2000, MICRO 33.
[13] Alan Jay Smith,et al. Branch Prediction Strategies and Branch Target Buffer Design , 1995, Computer.
[14] S. McFarling. Combining Branch Predictors , 1993.
[15] Michael F. P. O'Boyle,et al. IATAC: a smart predictor to turn-off L2 cache lines , 2005, TACO.
[16] Kevin Skadron,et al. Merging path and gshare indexing in perceptron branch prediction , 2005, TACO.
[17] Babak Falsafi,et al. Proactive instruction fetch , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[18] B. Fagin,et al. Partial resolution in branch target buffers , 1995, Proceedings of the 28th Annual International Symposium on Microarchitecture.
[19] Arnold L. Rosenberg,et al. Using the compiler to improve cache replacement decisions , 2002, Proceedings. International Conference on Parallel Architectures and Compilation Techniques.
[20] Roland N. Ibbett,et al. An Analysis of Instruction-Fetching Strategies in Pipelined Computers , 1980, IEEE Transactions on Computers.
[21] Samira Manabi Khan,et al. Sampling Dead Block Prediction for Last-Level Caches , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[22] Edward S. Davidson,et al. Reducing conflicts in direct-mapped caches with a temporality-based design , 1996, Proceedings of the 1996 ICPP Workshop on Challenges for Parallel Processing.
[23] Wen-mei W. Hwu,et al. Run-Time Cache Bypassing , 1999, IEEE Trans. Computers.
[24] Chyi-Chang Miao,et al. Compiler managed micro-cache bypassing for high performance EPIC processors , 2002, MICRO.
[25] Margaret Martonosi,et al. Speculative Updates of Local and Global Branch History: A Quantitative Analysis , 2000, J. Instr. Level Parallelism.
[26] Hideki Ando,et al. A Cost-Effective Branch Target Buffer with a Two-Level Table Organization , 1999.
[27] Ioana Burcea,et al. Phantom-BTB: a virtualized branch target buffer design , 2009, ASPLOS.
[28] Aamer Jaleel,et al. High performance cache replacement using re-reference interval prediction (RRIP) , 2010, ISCA.
[29] Margaret Martonosi,et al. Timekeeping in the memory system: predicting and optimizing memory behavior , 2002, ISCA.
[30] David A. Wood,et al. Dynamic self-invalidation: reducing coherence overhead in shared-memory multiprocessors , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[31] Daniel A. Jiménez,et al. Fast Path-Based Neural Branch Prediction , 2003, MICRO.
[32] James E. Smith,et al. A study of branch prediction strategies , 1981, ISCA '81.
[33] Barry S. Fagin,et al. Partial resolution in branch target buffers , 1995, MICRO.
[34] Brad Burgess. Samsung exynos M1 processor , 2016, 2016 IEEE Hot Chips 28 Symposium (HCS).
[35] Margaret Martonosi,et al. Cache decay: exploiting generational behavior to reduce cache leakage power , 2001, ISCA 2001.
[36] Per Stenström,et al. Enhancing Last-Level Cache Performance by Block Bypassing and Early Miss Determination , 2006, Asia-Pacific Computer Systems Architecture Conference.
[37] Mateo Valero,et al. Fetching instruction streams , 2002, 35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). Proceedings.
[38] Per Stenström,et al. A novel approach to cache block reuse predictions , 2003, 2003 International Conference on Parallel Processing, 2003. Proceedings.
[39] Kathryn S. McKinley,et al. Cooperative caching with keep-me and evict-me , 2005, 9th Annual Workshop on Interaction between Compilers and Computer Architectures (INTERACT'05).
[40] Dirk Grunwald,et al. Reducing branch costs via branch alignment , 1994, ASPLOS VI.
[41] Yan Solihin,et al. Counter-Based Cache Replacement and Bypassing Algorithms , 2008, IEEE Transactions on Computers.
[42] Josep Torrellas,et al. Optimizing instruction cache performance for operating system intensive workloads , 1995, Proceedings of 1995 1st IEEE Symposium on High Performance Computer Architecture.
[43] Wen-mei W. Hwu,et al. Run-time Adaptive Cache Hierarchy Via Reference Analysis , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[44] Babak Falsafi,et al. SHIFT: Shared history instruction fetch for lean-core server processors , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[45] Michael D. Smith,et al. Procedure placement using temporal-ordering information , 1999, TOPL.
[46] Carole-Jean Wu,et al. SHiP: Signature-based Hit Predictor for high performance caching , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[47] Gary S. Tyson,et al. A modified approach to data cache management , 1995, Proceedings of the 28th Annual International Symposium on Microarchitecture.
[48] Brad Calder,et al. Efficient procedure mapping using cache line coloring , 1997, PLDI '97.
[49] Jaehyuk Huh,et al. Cache bursts: A new approach for eliminating dead blocks and increasing cache efficiency , 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.
[50] Gary S. Tyson,et al. Utilizing reuse information in data cache management , 1998, ICS '98.
[51] Thomas F. Wenisch,et al. RDIP: Return-address-stack Directed Instruction Prefetching , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[52] Babak Falsafi,et al. Confluence: Unified instruction supply for scale-out servers , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[53] Thomas F. Wenisch,et al. Memory coherence activity prediction in commercial workloads , 2004, WMPI '04.
[54] Onur Mutlu,et al. A Case for MLP-Aware Cache Replacement , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).
[55] Chandra Krintz,et al. Cache-conscious data placement , 1998, ASPLOS VIII.
[56] W. W. Hwu,et al. Achieving high instruction cache performance with an optimizing compiler , 1989, ISCA '89.
[57] Daniel A. Jiménez,et al. Dynamic branch prediction with perceptrons , 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.
[58] Cheng-Chieh Huang,et al. Boomerang: A Metadata-Free Architecture for Control Flow Delivery , 2017, 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[59] Scott McFarling,et al. Program optimization for instruction caches , 1989, ASPLOS III.
[60] James R. Goodman,et al. The declining effectiveness of dynamic caching for general-purpose microprocessors , 1995.
[61] Mateo Valero,et al. A Data Cache with Multiple Caching Strategies Tuned to Different Types of Locality , 1995, International Conference on Supercomputing.
[62] Chris H. Perleberg,et al. Branch Target Buffer Design and Optimization , 1993, IEEE Trans. Computers.
[63] Thomas F. Wenisch,et al. Temporal instruction fetch streaming , 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.
[64] Onur Mutlu,et al. Exploiting compressed block size as an indicator of future reuse , 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).
[65] Gary S. Tyson,et al. Active Management of Data Caches by Exploiting Reuse Information , 1999, IEEE Trans. Computers.
[66] Babak Falsafi,et al. Selective, accurate, and timely self-invalidation using last-touch prediction , 2000, ISCA '00.