SAMO: store aware memory optimizations
[1] Josep Torrellas,et al. Scalable Cache Miss Handling for High Memory-Level Parallelism , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[2] William J. Dally,et al. Memory access scheduling , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[3] John L. Henning. SPEC CPU2006 benchmark descriptions , 2006, CARN.
[4] Thomas F. Wenisch,et al. Mechanisms for store-wait-free multiprocessors , 2007, ISCA '07.
[5] Kevin Kai-Wei Chang,et al. Staged memory scheduling: Achieving high performance and scalability in heterogeneous systems , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).
[6] Milo M. K. Martin,et al. NoSQ: Store-Load Communication without a Store Queue , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[7] Sai Prashanth Muralidhara,et al. Reducing memory interference in multicore systems via application-aware memory channel partitioning , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[8] Víctor Viñals,et al. Store buffer design in first-level multibanked data caches , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[9] Gabriel H. Loh,et al. Criticality-based optimizations for efficient load processing , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.
[10] Mikko H. Lipasti,et al. Modern Processor Design: Fundamentals of Superscalar Processors , 2002 .
[11] Gabriel H. Loh,et al. Fire-and-Forget: Load/Store Scheduling with No Store Queue at All , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[12] David A. Patterson,et al. Computer Architecture, Fifth Edition: A Quantitative Approach , 2011 .
[13] David A. Patterson,et al. Computer Architecture: A Quantitative Approach , 1969 .
[14] Simha Sethumadhavan,et al. Scalable hardware memory disambiguation for high-ILP processors , 2003, IEEE Micro.
[15] Milo M. K. Martin,et al. Scalable store-load forwarding via store queue index prediction , 2005, 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05).
[16] José F. Martínez,et al. MORSE: Multi-objective reconfigurable self-optimizing memory scheduler , 2012, IEEE International Symposium on High-Performance Computer Architecture.
[17] A. Snavely,et al. Symbiotic jobscheduling for a simultaneous multithreading processor , 2000, SIGP.
[18] Aamer Jaleel,et al. Using virtual load/store queues (VLSQs) to reduce the negative effects of reordered memory instructions , 2005, 11th International Symposium on High-Performance Computer Architecture.
[19] Chris Fallin,et al. Parallel application memory scheduling , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[20] Santosh G. Abraham,et al. Store memory-level parallelism optimizations for commercial applications , 2005, 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05).
[21] Somayeh Sardashti,et al. The gem5 simulator , 2011, CARN.
[22] Mor Harchol-Balter,et al. ATLAS: A scalable and high-performance scheduling algorithm for multiple memory controllers , 2010, 2010 Sixteenth International Symposium on High-Performance Computer Architecture (HPCA-16).
[23] W. Marsden. I and J , 2012 .
[24] Calvin Lin,et al. Adaptive History-Based Memory Schedulers for Modern Processors , 2006, IEEE Micro.
[25] Onur Mutlu,et al. Parallelism-Aware Batch Scheduling: Enhancing both Performance and Fairness of Shared DRAM Systems , 2008, 2008 International Symposium on Computer Architecture.
[26] José F. Martínez,et al. Improving memory scheduling via processor-side load criticality information , 2013, ISCA.