Thread Owned Block Cache: Managing Latency in Many-Core Architecture
[1] G. Edward Suh, et al. Dynamic Partitioning of Shared Cache Memory, 2004, The Journal of Supercomputing.
[2] Yale N. Patt, et al. Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches, 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[3] Doug Burger, et al. An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches, 2002, ASPLOS X.
[4] Gu-Yeon Wei, et al. Process Variation Tolerant 3T1D-Based Cache Architectures, 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[5] Anoop Gupta, et al. The SPLASH-2 programs: characterization and methodological considerations, 1995, ISCA.
[6] Jichuan Chang, et al. Cooperative Caching for Chip Multiprocessors, 2006, 33rd International Symposium on Computer Architecture (ISCA'06).
[7] Moinuddin K. Qureshi. Adaptive Spill-Receive for robust high-performance caching in CMPs, 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.
[8] Norman P. Jouppi, et al. Optimizing NUCA Organizations and Wiring Alternatives for Large Caches with CACTI 6.0, 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[9] Mary Jane Irwin, et al. A novel migration-based NUCA design for chip multiprocessors, 2008, HiPC 2008.
[10] Kunle Olukotun, et al. The case for a single-chip multiprocessor, 1996, ASPLOS VII.
[11] Lei Liu, et al. Godson-T: An Efficient Many-Core Architecture for Parallel Program Executions, 2009, Journal of Computer Science and Technology.
[12] Gregory F. Pfister, et al. “Hot spot” contention and combining in multistage interconnection networks, 1985, IEEE Transactions on Computers.
[13] Henry Hoffmann, et al. Evaluation of the Raw microprocessor: an exposed-wire-delay architecture for ILP and streams, 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004.
[14] José E. Moreira, et al. Dissecting Cyclops: a detailed analysis of a multithreaded architecture, 2003, CARN.
[15] José González, et al. The design and performance of a conflict-avoiding cache, 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[16] Michael Zhang, et al. Victim Migration: Dynamically Adapting Between Private and Shared CMP Caches, 2005.
[17] Jaehyuk Huh, et al. A NUCA substrate for flexible CMP cache sharing, 2005, ICS.
[18] Jichuan Chang, et al. Cooperative cache partitioning for chip multiprocessors, 2007, ICS '07.
[19] Kunle Olukotun, et al. Niagara: a 32-way multithreaded Sparc processor, 2005, IEEE Micro.
[20] Wen Gao, et al. Exploiting the kernel trick to correlate fragment ions for peptide identification via tandem mass spectrometry, 2004, Bioinform.
[21] S. Kim, et al. Fair cache sharing and partitioning in a chip multiprocessor architecture, 2004, Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004.
[22] Samuel Williams, et al. The Landscape of Parallel Computing Research: A View from Berkeley, 2006.
[23] Dean M. Tullsen, et al. Runtime identification of cache conflict misses: The adaptive miss buffer, 2001, TOCS.
[24] Babak Falsafi, et al. R-NUCA: Data Placement in Distributed Shared Caches, 2009.
[25] Mahmut T. Kandemir, et al. Adaptive set pinning: managing shared caches in chip multiprocessors, 2008, ASPLOS.
[26] Norman P. Jouppi, et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and pre, 1990, ISCA 1990.
[27] Chuanjun Zhang. Balanced Cache: Reducing Conflict Misses of Direct-Mapped Caches, 2006, 33rd International Symposium on Computer Architecture (ISCA'06).
[28] Song Feng. An Implicitly Dynamic Shared Cache Isolation in Many-Core Architecture, 2009.