Temporal instruction fetch streaming
Thomas F. Wenisch | Babak Falsafi | Anastasia Ailamaki | Andreas Moshovos | Michael Ferdman
[1] Kenneth A. Ross,et al. Buffering database operations for enhanced instruction cache performance , 2004, SIGMOD '04.
[2] Ian H. Witten,et al. Identifying Hierarchical Structure in Sequences: A linear-time algorithm , 1997, J. Artif. Intell. Res..
[3] Santosh G. Abraham,et al. Effective instruction prefetching in chip multiprocessors for modern commercial applications , 2005, 11th International Symposium on High-Performance Computer Architecture.
[4] Glenn Reinman,et al. Fetch directed instruction prefetching , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.
[5] Yuan Chou,et al. Low-Cost Epoch-Based Correlation Prefetching for Commercial Applications , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[6] David W. Anderson,et al. The IBM System/360 model 91: machine philosophy and instruction-handling , 1967.
[7] Babak Falsafi,et al. Last-Touch Correlated Data Streaming , 2007, 2007 IEEE International Symposium on Performance Analysis of Systems & Software.
[8] Craig Zilles,et al. Execution-based prediction using speculative slices , 2001, ISCA 2001.
[9] Trevor N. Mudge,et al. Instruction prefetching using branch prediction information , 1997, Proceedings International Conference on Computer Design VLSI in Computers and Processors.
[10] David J. DeWitt,et al. DBMSs on a Modern Processor: Where Does Time Go? , 1999, VLDB.
[11] Anastasia Ailamaki,et al. STEPS towards Cache-resident Transaction Processing , 2004, VLDB.
[12] Jignesh M. Patel,et al. Call graph prefetching for database applications , 2003, TOCS.
[13] Alan Jay Smith,et al. Sequential Program Prefetching in Memory Hierarchies , 1978, Computer.
[14] Thomas F. Wenisch,et al. Temporal streaming of shared memory , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[15] David A. Patterson,et al. Performance characterization of a Quad Pentium Pro SMP using OLTP workloads , 1998, ISCA.
[16] Glenn Reinman,et al. Optimizations Enabled by a Decoupled Front-End Architecture , 2001, IEEE Trans. Computers.
[17] Eric Rotenberg,et al. Trace cache: a low latency approach to high bandwidth instruction fetching , 1996, Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture. MICRO 29.
[18] Todd C. Mowry,et al. Cooperative prefetching: compiler and hardware support for effective instruction prefetching in modern processors , 1998, Proceedings. 31st Annual ACM/IEEE International Symposium on Microarchitecture.
[19] Trishul M. Chilimbi. Efficient representations and abstractions for quantifying and exploiting data reference locality , 2001, PLDI '01.
[20] Mateo Valero,et al. Enlarging Instruction Streams , 2007, IEEE Transactions on Computers.
[21] Thomas F. Wenisch,et al. Temporal memory streaming , 2007.
[22] James E. Smith,et al. Data Cache Prefetching Using a Global History Buffer , 2004, 10th International Symposium on High Performance Computer Architecture (HPCA'04).
[23] Mateo Valero,et al. Fetching instruction streams , 2002, 35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). Proceedings..
[24] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.
[25] Babak Falsafi,et al. Database Servers on Chip Multiprocessors: Limitations and Opportunities , 2007, CIDR.
[26] Josep Torrellas,et al. Using a user-level memory thread for correlation prefetching , 2002, ISCA.
[27] Onur Mutlu,et al. Runahead Execution: An Effective Alternative to Large Instruction Windows , 2003, IEEE Micro.
[28] K. Sundaramoorthy,et al. Slipstream processors: improving both performance and fault tolerance , 2000, SIGP.
[29] Joseph A. Fisher,et al. Trace Scheduling: A Technique for Global Microcode Compaction , 1981, IEEE Transactions on Computers.
[30] R. Stets,et al. A detailed comparison of two transaction processing workloads , 2002, 2002 IEEE International Workshop on Workload Characterization.
[31] Susan J. Eggers,et al. An analysis of database workload performance on simultaneous multithreaded processors , 1998, ISCA.
[32] Gary S. Tyson,et al. Branch history guided instruction prefetching , 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.
[33] Thomas F. Wenisch,et al. Temporal streams in commercial server applications , 2008, 2008 IEEE International Symposium on Workload Characterization.
[34] Chi-Keung Luk,et al. Tolerating memory latency through software-controlled pre-execution in simultaneous multithreading processors , 2001, Proceedings 28th Annual International Symposium on Computer Architecture.
[35] J. Larus. Whole program paths , 1999, PLDI '99.
[36] Onur Mutlu,et al. Software-Based Online Detection of Hardware Defects: Mechanisms, Architectural Support, and Evaluation , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[37] Babak Falsafi,et al. Predictor virtualization , 2008, ASPLOS.
[38] Thomas F. Wenisch,et al. SimFlex: Statistical Sampling of Computer System Simulation , 2006, IEEE Micro.
[39] Brad Calder,et al. Predictor-directed stream buffers , 2000, MICRO 33.