Enhancing L2 organization for CMPs with a center cell