An optimized 3D-stacked memory architecture by exploiting excessive, high-density TSV bandwidth
[1] Luca P. Carloni,et al. Photonic NoCs: System-Level Design Exploration , 2009, IEEE Micro.
[2] Kunle Olukotun,et al. Niagara: a 32-way multithreaded Sparc processor , 2005, IEEE Micro.
[3] Josep Torrellas,et al. Shared Data Placement Optimizations to Reduce Multiprocessor Cache Miss Rates , 1990, ICPP.
[4] Norman P. Jouppi,et al. WRL Research Report 93/5: An Enhanced Access and Cycle Time Model for On-chip Caches , 1994 .
[5] Rajeev Balasubramonian,et al. Optimizing communication and capacity in a 3D stacked reconfigurable cache hierarchy , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.
[6] Jun Yang,et al. A low-radix and low-diameter 3D interconnection network design , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.
[7] Alan Jay Smith,et al. Line (Block) Size Choice for CPU Cache Memories , 1987, IEEE Transactions on Computers.
[8] Edward T. Grochowski,et al. Larrabee: A many-Core x86 architecture for visual computing , 2008, 2008 IEEE Hot Chips 20 Symposium (HCS).
[9] Alan Jay Smith,et al. Sequential Program Prefetching in Memory Hierarchies , 1978, Computer.
[10] Kathryn S. McKinley,et al. Guided region prefetching: a cooperative hardware/software approach , 2003, ISCA '03.
[11] C. J. Conti,et al. Structural aspects of the System/360 Model 85 , 1968 .
[12] Erik Lindholm,et al. NVIDIA Tesla: A Unified Graphics and Computing Architecture , 2008, IEEE Micro.
[13] Sanjay J. Patel,et al. Rigel: an architecture and scalable programming interface for a 1000-core accelerator , 2009, ISCA '09.
[14] Hsien-Hsin S. Lee,et al. Architectural evaluation of 3D stacked RRAM caches , 2009, 2009 IEEE International Conference on 3D System Integration.
[15] James E. Smith,et al. Data Cache Prefetching Using a Global History Buffer , 2004, 10th International Symposium on High Performance Computer Architecture (HPCA'04).
[16] Jean-Loup Baer,et al. Two techniques for improving performance on bus-based multiprocessors , 1995, Future Gener. Comput. Syst..
[17] Per Stenström,et al. TLB and snoop energy-reduction using virtual caches in low-power chip-multiprocessors , 2002, ISLPED '02.
[18] Martin Burtscher,et al. Bridging the processor-memory performance gap with 3D IC technology , 2005, IEEE Design & Test of Computers.
[19] Yu Cao,et al. New paradigm of predictive MOSFET and interconnect modeling for early circuit simulation , 2000, Proceedings of the IEEE 2000 Custom Integrated Circuits Conference (Cat. No.00CH37044).
[20] Olivier Temam,et al. MicroLib: A Case for the Quantitative Comparison of Micro-Architecture Mechanisms , 2004, 37th International Symposium on Microarchitecture (MICRO-37'04).
[21] Alan L. Cox,et al. Evaluation of release consistent software distributed shared memory on emerging network technology , 1993, ISCA '93.
[22] K. Kavi. Cache Memories in Uniprocessors. Reading versus Writing. Improving Performance , 2022 .
[23] Tao Li,et al. Microarchitecture soft error vulnerability characterization and mitigation under 3D integration technology , 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.
[24] Coniferous softwood. GENERAL TERMS , 2003 .
[25] Wei-Fen Lin,et al. Reducing DRAM latencies with an integrated memory hierarchy design , 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.
[26] Xiaoxia Wu,et al. Hybrid cache architecture with disparate memory technologies , 2009, ISCA '09.
[27] Steven Przybylski. The performance impact of block sizes and fetch strategies , 1990, ISCA '90.
[28] Lei Jiang,et al. Die Stacking (3D) Microarchitecture , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[29] Andreas Moshovos. RegionScout: exploiting coarse grain sharing in snoop-based coherence , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[30] Kaustav Banerjee,et al. A thermally-aware performance analysis of vertically integrated (3-D) processor-memory hierarchy , 2006, 2006 43rd ACM/IEEE Design Automation Conference.
[31] Livio Ricciulli,et al. The detection and elimination of useless misses in multiprocessors , 1993, ISCA '93.
[32] John S. Liptay,et al. Structural Aspects of the System/360 Model 85 II: The Cache , 1968, IBM Syst. J..
[33] Richard E. Matick,et al. Logic-based eDRAM: Origins and rationale for use , 2005, IBM J. Res. Dev..
[34] Mikko H. Lipasti,et al. Stealth prefetching , 2006, ASPLOS XII.
[35] Yvon Jégou,et al. Using virtual lines to enhance locality exploitation , 1994, ICS '94.
[36] Gabriel H. Loh,et al. 3D-Stacked Memory Architectures for Multi-core Processors , 2008, 2008 International Symposium on Computer Architecture.
[37] Thomas F. Wenisch,et al. Spatial Memory Streaming , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).
[38] Krisztián Flautner,et al. PicoServer: using 3D stacking technology to enable a compact energy efficient chip multiprocessor , 2006, ASPLOS XII.
[39] Rajeev Balasubramonian,et al. Leveraging 3D Technology for Improved Reliability , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[40] D. Burger,et al. Memory Bandwidth Limitations of Future Microprocessors , 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).
[41] Wen-Hann Wang,et al. Organization And Performance Of A Two-level Virtual-real Cache Hierarchy , 1989, The 16th Annual International Symposium on Computer Architecture.
[42] Yiran Chen,et al. A novel architecture of the 3D stacked MRAM L2 cache for CMPs , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.
[43] Thomas J. LeBlanc,et al. Adjustable block size coherent caches , 1992, ISCA '92.
[44] Chita R. Das,et al. A novel dimensionally-decomposed router for on-chip communication in 3D architectures , 2007, ISCA '07.
[45] Hsien-Hsin S. Lee,et al. POD: A 3D-Integrated Broad-Purpose Acceleration Layer , 2008, IEEE Micro.
[46] Mahmut T. Kandemir,et al. Design and Management of 3D Chip Multiprocessors Using Network-in-Memory , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).
[47] Michel Dubois,et al. Cache protocols with partial block invalidations , 1993, [1993] Proceedings Seventh International Parallel Processing Symposium.
[48] Josep Torrellas,et al. False Sharing and Spatial Locality in Multiprocessor Caches , 1994, IEEE Trans. Computers.