[1] Onur Mutlu,et al. Improving cache performance using read-write partitioning , 2014, 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).
[2] T. Hamamoto,et al. On the retention time distribution of dynamic random access memory (DRAM) , 1998 .
[3] Onur Mutlu,et al. Error Analysis and Retention-Aware Error Management for NAND Flash Memory , 2013 .
[4] Brent Keeth,et al. DRAM Circuit Design: A Tutorial , 2000 .
[5] Kiyoung Choi,et al. A scalable processing-in-memory accelerator for parallel graph processing , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[6] D. Yaney,et al. A meta-stable leakage phenomenon in DRAM charge storage —Variable hold time , 1987, 1987 International Electron Devices Meeting.
[7] Kieran McLaughlin,et al. An RLDRAM II Implementation of a 10Gbps Shared Packet Buffer for Network Processing , 2007, Second NASA/ESA Conference on Adaptive Hardware and Systems (AHS 2007).
[8] Xin Li,et al. A Memory Soft Error Measurement on Production Systems , 2007, USENIX Annual Technical Conference.
[9] Onur Mutlu,et al. Address-value delta (AVD) prediction: increasing the effectiveness of runahead execution by exploiting regular memory allocation patterns , 2005, 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05).
[10] Vijayalakshmi Srinivasan,et al. Enhancing lifetime and security of PCM-based Main Memory with Start-Gap Wear Leveling , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[11] Jae-Kyung Wee,et al. A post-package bit-repair scheme using static latches with bipolar-voltage programmable antifuse circuit for high-density DRAMs , 2002 .
[12] Bong-Seok Han,et al. Adaptive Self Refresh Scheme for Battery Operated High-Density Mobile DRAM Applications , 2006, 2006 IEEE Asian Solid-State Circuits Conference.
[13] Norman P. Jouppi,et al. Single-ISA heterogeneous multi-core architectures: the potential for processor power reduction , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36.
[14] Shimeng Yu,et al. Metal–Oxide RRAM , 2012, Proceedings of the IEEE.
[15] Onur Mutlu,et al. Parallelism-Aware Batch Scheduling: Enhancing both Performance and Fairness of Shared DRAM Systems , 2008, 2008 International Symposium on Computer Architecture.
[16] Zhao Zhang,et al. Software thermal management of DRAM memory for multicore systems , 2008, SIGMETRICS '08.
[17] Onur Mutlu,et al. Prefetch-aware shared-resource management for multi-core systems , 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).
[18] Gennady Pekhimenko,et al. Toggle-Aware Bandwidth Compression for GPUs , 2015 .
[19] Chita R. Das,et al. A heterogeneous multiple network-on-chip design: An application-aware approach , 2013, 2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC).
[20] Rachata Ausavarungnirun,et al. Row buffer locality aware caching policies for hybrid memories , 2012, 2012 IEEE 30th International Conference on Computer Design (ICCD).
[21] Trevor Mudge,et al. Improving data cache performance by pre-executing instructions under a cache miss , 1997 .
[22] A. Snavely,et al. Symbiotic jobscheduling for a simultaneous multithreading processor , 2000, SIGP.
[23] Mahmut T. Kandemir,et al. Managing GPU Concurrency in Heterogeneous Architectures , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[24] Y. Mori,et al. The origin of variable retention time in DRAM , 2005, IEEE International Electron Devices Meeting, 2005. IEDM Technical Digest.
[25] David A. Wood,et al. Adaptive cache compression for high-performance processors , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004.
[26] O Seongil,et al. Reducing memory access latency with asymmetric DRAM bank organizations , 2013, ISCA.
[27] John Paul Shen,et al. Mitigating Amdahl's law through EPI throttling , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[28] Aamer Jaleel,et al. High performance cache replacement using re-reference interval prediction (RRIP) , 2010, ISCA.
[29] Anna R. Karlin,et al. A study of integrated prefetching and caching strategies , 1995, SIGMETRICS '95/PERFORMANCE '95.
[30] Carlos Alvarez Martinez,et al. Dynamic Tolerance Region Computing for Multimedia , 2012, IEEE Transactions on Computers.
[31] Jian Huang,et al. Exploiting basic block value locality with block reuse , 1999, Proceedings Fifth International Symposium on High-Performance Computer Architecture.
[32] Fred Douglis,et al. The Compression Cache: Using On-line Compression to Extend Physical Memory , 1993, USENIX Winter.
[33] Onur Mutlu,et al. ChargeCache: Reducing DRAM latency by exploiting row access locality , 2016, 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[34] Onur Mutlu,et al. Low-Cost Inter-Linked Subarrays (LISA): Enabling fast inter-subarray data movement in DRAM , 2016, 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[35] James E. Smith,et al. The predictability of data values , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[36] Feng Lin,et al. DRAM circuit design , 2000 .
[37] Young-Hyun Jun,et al. 1.2V 1.6Gb/s 56nm 6F2 4Gb DDR3 SDRAM with hybrid-I/O sense amplifier and segmented sub-array architecture , 2009, 2009 IEEE International Solid-State Circuits Conference - Digest of Technical Papers.
[38] Youyou Lu,et al. Loose-Ordering Consistency for persistent memory , 2014, 2014 IEEE 32nd International Conference on Computer Design (ICCD).
[39] Yale N. Patt,et al. Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[40] Rajeev Balasubramonian,et al. Interconnect design considerations for large NUCA caches , 2007, ISCA '07.
[41] Kinam Kim,et al. A New Investigation of Data Retention Time in Truly Nanoscaled DRAMs , 2009, IEEE Electron Device Letters.
[42] Kevin Kai-Wei Chang,et al. Staged memory scheduling: Achieving high performance and scalability in heterogeneous systems , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).
[43] Peter M. Kogge,et al. EXECUBE-A New Architecture for Scaleable MPPs , 1994, 1994 International Conference on Parallel Processing Vol. 1.
[44] Osman S. Unsal,et al. Flash correct-and-refresh: Retention-aware error management for increased flash memory lifetime , 2012, 2012 IEEE 30th International Conference on Computer Design (ICCD).
[45] Onur Mutlu,et al. A case for exploiting subarray-level parallelism (SALP) in DRAM , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).
[46] Onur Mutlu,et al. DRAM-Aware Last-Level Cache Writeback: Reducing Write-Caused Interference in Memory Systems , 2010 .
[47] Wen-mei W. Hwu,et al. Run-Time Cache Bypassing , 1999, IEEE Trans. Computers.
[48] Jae-Kyung Wee,et al. An antifuse EPROM circuitry scheme for field-programmable repair in DRAM , 2000, IEEE Journal of Solid-State Circuits.
[49] Chita R. Das,et al. Application-aware prioritization mechanisms for on-chip networks , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[50] Jun Yang,et al. Phase-Change Technology and the Future of Main Memory , 2010, IEEE Micro.
[51] Shih-Hung Chen,et al. Phase-change random access memory: A scalable technology , 2008, IBM J. Res. Dev.
[52] Zhao Zhang,et al. Cached DRAM for ILP Processor Memory Access Latency Reduction , 2001, IEEE Micro.
[53] Norbert Wehn,et al. Embedded DRAM Development: Technology, Physical Design, and Application Issues , 2001, IEEE Des. Test Comput..
[54] Mikko H. Lipasti,et al. Value locality and load value prediction , 1996, ASPLOS VII.
[55] Chia-Lin Yang,et al. Improving DRAM latency with dynamic asymmetric subarray , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[56] Onur Mutlu,et al. A Case for MLP-Aware Cache Replacement , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).
[57] Yang Xiao,et al. Low power memristor-based ReRAM design with Error Correcting Code , 2012, 17th Asia and South Pacific Design Automation Conference.
[58] Onur Mutlu,et al. MISE: Providing performance predictability and improving fairness in shared main memory systems , 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).
[59] Carlos Alvarez,et al. On the potential of tolerant region reuse for multimedia applications , 2001, ICS '01.
[60] Gabriel H. Loh,et al. 3D-Stacked Memory Architectures for Multi-core Processors , 2008, 2008 International Symposium on Computer Architecture.
[61] Jongmoo Choi,et al. WARM: Improving NAND flash memory lifetime with write-hotness aware retention management , 2015, 2015 31st Symposium on Mass Storage Systems and Technologies (MSST).
[62] Bianca Schroeder,et al. Cosmic rays don't strike twice: understanding the nature of DRAM errors and the implications for system design , 2012, ASPLOS XVII.
[63] G.S. Sohi,et al. Dynamic instruction reuse , 1997, ISCA '97.
[64] R.-L. Jiang,et al. Guardband determination for the detection of off-state and junction leakages in DRAM testing , 2001, Proceedings 10th Asian Test Symposium.
[65] James E. Smith,et al. Performance of Cached DRAM Organizations in Vector Supercomputers , 1993, Proceedings of the 20th Annual International Symposium on Computer Architecture.
[66] Onur Mutlu,et al. The application slowdown model: Quantifying and controlling the impact of inter-application interference at shared caches and main memory , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[67] Onur Mutlu,et al. Improving DRAM performance by parallelizing refreshes with accesses , 2014, 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).
[68] Christoforos E. Kozyrakis,et al. A case for intelligent RAM , 1997, IEEE Micro.
[69] Onur Mutlu,et al. Improving memory Bank-Level Parallelism in the presence of prefetching , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[70] Yannis Smaragdakis,et al. The Case for Compressed Caching in Virtual Memory Systems , 1999, USENIX Annual Technical Conference, General Track.
[71] Jong-Ho Kang,et al. A 1.2V 23nm 6F2 4Gb DDR3 SDRAM with local-bitline sense amplifier, hybrid LIO sense amplifier and dummy-less array architecture , 2012, 2012 IEEE International Solid-State Circuits Conference.
[72] Tajana Simunic,et al. PDRAM: A hybrid PRAM and DRAM main memory system , 2009, 2009 46th ACM/IEEE Design Automation Conference.
[73] Ricardo Bianchini,et al. Page placement in hybrid memory systems , 2011, ICS '11.
[74] Onur Mutlu,et al. The Blacklisting Memory Scheduler: Achieving high performance and fairness at low cost , 2014, 2014 IEEE 32nd International Conference on Computer Design (ICCD).
[75] Onur Mutlu,et al. The efficacy of error mitigation techniques for DRAM retention failures: a comparative experimental study , 2014, SIGMETRICS '14.
[76] Tze Meng Low,et al. 3D-Stacked Memory-Side Acceleration: Accelerator and System Design , 2014 .
[77] Wongyu Shin,et al. Multiple Clone Row DRAM: A low latency and area optimized DRAM , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[78] Onur Mutlu,et al. Rollback-free value prediction with approximate loads , 2014, 2014 23rd International Conference on Parallel Architecture and Compilation (PACT).
[79] Richard Veras,et al. RAIDR: Retention-aware intelligent DRAM refresh , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).
[80] Frederick A. Ware,et al. Improving Power and Data Efficiency with Threaded Memory Modules , 2006, 2006 International Conference on Computer Design.
[81] Chita R. Das,et al. Aérgia: exploiting packet latency slack in on-chip networks , 2010, ISCA.
[82] Bruce R. Childers,et al. COMeT: Continuous Online Memory Test , 2011, 2011 IEEE 17th Pacific Rim International Symposium on Dependable Computing.
[83] Onur Mutlu,et al. A QoS-Enabled On-Die Interconnect Fabric for Kilo-Node Chips , 2012, IEEE Micro.
[84] Onur Mutlu,et al. Program interference in MLC NAND flash memory: Characterization, modeling, and mitigation , 2013, ICCD.
[85] Rachata Ausavarungnirun,et al. RowClone: Fast and energy-efficient in-DRAM bulk data copy and initialization , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[86] Kiyoung Choi,et al. PIM-enabled instructions: A low-overhead, locality-aware processing-in-memory architecture , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[87] Mahmut T. Kandemir,et al. Evaluating STT-RAM as an energy-efficient main memory alternative , 2013, 2013 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[88] Costas J. Spanos,et al. Modeling within-die spatial correlation effects for process-design co-optimization , 2005, Sixth international symposium on quality electronic design (isqed'05).
[89] Onur Mutlu,et al. The Blacklisting Memory Scheduler: Balancing Performance, Fairness and Complexity , 2015, ArXiv.
[90] Sai Prashanth Muralidhara,et al. Reducing memory interference in multicore systems via application-aware memory channel partitioning , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[91] Wen-mei W. Hwu,et al. Compiler-directed dynamic computation reuse: rationale and initial results , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.
[92] Sherif M. Sharroush,et al. Dynamic random-access memories without sense amplifiers , 2012, Elektrotech. Informationstechnik.
[93] S. Stall,et al. Improving Cache Performance by Exploiting Read-Write Disparity , 2014 .
[94] Ad J. van de Goor,et al. Address and data scrambling: causes and impact on memory tests , 2002, Proceedings First IEEE International Workshop on Electronic Design, Test and Applications '2002.
[95] Rajeev Balasubramonian,et al. MemZip: Exploring unconventional benefits from memory compression , 2014, 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).
[96] Sudhanva Gurumurthi,et al. Feng Shui of supercomputer memory positional effects in DRAM and SRAM faults , 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[97] Reetuparna Das,et al. Design and Evaluation of Hierarchical Rings with Deflection Routing , 2014, 2014 IEEE 26th International Symposium on Computer Architecture and High Performance Computing.
[98] Michel Dubois,et al. Sequential Hardware Prefetching in Shared-Memory Multiprocessors , 1995, IEEE Trans. Parallel Distributed Syst..
[99] Harish Patil,et al. Pin: building customized program analysis tools with dynamic instrumentation , 2005, PLDI '05.
[100] Zhen Fang,et al. Leveraging Heterogeneity in DRAM Main Memories to Accelerate Critical Word Access , 2012, 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture.
[101] J. W. Park,et al. DRAM variable retention time , 1992, 1992 International Technical Digest on Electron Devices Meeting.
[102] Dongwoo Kang,et al. Amnesic cache management for non-volatile memory , 2015, 2015 31st Symposium on Mass Storage Systems and Technologies (MSST).
[103] Shuai Li,et al. High-Performance and Lightweight Transaction Support in Flash-Based SSDs , 2015, IEEE Transactions on Computers.
[104] Harold S. Stone,et al. A Logic-in-Memory Computer , 1970, IEEE Transactions on Computers.
[105] Scott A. Mahlke,et al. Composite Cores: Pushing Heterogeneity Into a Core , 2012, 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture.
[106] Onur Mutlu,et al. Prefetch-Aware DRAM Controllers , 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.
[107] Stijn Eyerman,et al. System-Level Performance Metrics for Multiprogram Workloads , 2008, IEEE Micro.
[108] Amandeep Singh,et al. Software based in-system memory test for highly available systems , 2005, 2005 IEEE International Workshop on Memory Technology, Design, and Testing (MTDT'05).
[109] Hubertus Franke,et al. Memory Expansion Technology (MXT): Software support and performance , 2001, IBM J. Res. Dev.
[110] Onur Mutlu,et al. Runahead execution: an alternative to very large instruction windows for out-of-order processors , 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings.
[111] Onur Mutlu,et al. A Case for Efficient Hardware/Software Cooperative Management of Storage and Memory , 2013 .
[112] Onur Mutlu,et al. Memory scaling: A systems architecture perspective , 2013, 2013 5th IEEE International Memory Workshop.
[113] Wongyu Shin,et al. NUAT: A non-uniform access time memory controller , 2014, 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).
[114] Mahmut T. Kandemir,et al. Exploiting Inter-Warp Heterogeneity to Improve GPGPU Performance , 2015, 2015 International Conference on Parallel Architecture and Compilation (PACT).
[115] Osman S. Unsal,et al. Neighbor-cell assisted error correction for MLC NAND flash memories , 2014, SIGMETRICS '14.
[116] Onur Mutlu,et al. Stall-Time Fair Memory Access Scheduling for Chip Multiprocessors , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[117] Charles A. Hart. CDRAM in a unified memory architecture , 1994, Proceedings of COMPCON '94.
[118] J. Torrellas,et al. VARIUS: A Model of Process Variation and Resulting Timing Errors for Microarchitects , 2008, IEEE Transactions on Semiconductor Manufacturing.
[119] Onur Mutlu,et al. Self-Optimizing Memory Controllers: A Reinforcement Learning Approach , 2008, 2008 International Symposium on Computer Architecture.
[120] Onur Mutlu,et al. RFVP: Rollback-Free Value Prediction with Safe-to-Approximate Loads , 2016, ACM Trans. Archit. Code Optim..
[121] Jing Li,et al. A case for small row buffers in non-volatile main memories , 2012, 2012 IEEE 30th International Conference on Computer Design (ICCD).
[122] Song Liu,et al. Hardware/software techniques for DRAM thermal management , 2011, 2011 IEEE 17th International Symposium on High Performance Computer Architecture.
[123] Onur Mutlu,et al. Base-Delta-Immediate Compression: A Practical Data Compression Mechanism for On-Chip Caches , 2012 .
[124] Onur Mutlu,et al. Adaptive-latency DRAM: Optimizing DRAM timing for the common-case , 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).
[125] James E. Smith,et al. Fair Queuing Memory Systems , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[126] Rajiv Kapoor,et al. Pinpointing Representative Portions of Large Intel® Itanium® Programs with Dynamic Instrumentation , 2004, 37th International Symposium on Microarchitecture (MICRO-37'04).
[127] Tao Zhang,et al. Half-DRAM: A high-bandwidth and low-power DRAM architecture from the rethinking of fine-grained activation , 2014, 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).
[128] Onur Mutlu,et al. Memory Performance Attacks: Denial of Memory Service in Multi-Core Systems , 2007, USENIX Security Symposium.
[129] Anoop Gupta,et al. Operating system support for improving data locality on CC-NUMA compute servers , 1996, ASPLOS VII.
[130] Chris Fallin,et al. Memory power management via dynamic voltage/frequency scaling , 2011, ICAC '11.
[131] Onur Mutlu,et al. Accelerating read mapping with FastHASH , 2013, BMC Genomics.
[132] Reetuparna Das,et al. Application-to-core mapping policies to reduce memory system interference in multi-core systems , 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).
[133] Larry Rudolph,et al. Accelerating multi-media processing by implementing memoing in multiplication and division units , 1998, ASPLOS VIII.
[134] Onur Mutlu,et al. PARBOR: An Efficient System-Level Technique to Detect Data-Dependent Failures in DRAM , 2016, 2016 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN).
[135] Hideto Hidaka,et al. The cache DRAM architecture: a DRAM with an on-chip cache memory , 1990, IEEE Micro.
[136] Jinseok Lee,et al. Bit line coupling scheme and electrical fuse circuit for reliable operation of high density DRAM , 2001, 2001 Symposium on VLSI Circuits. Digest of Technical Papers (IEEE Cat. No.01CH37185).
[137] Jung Ho Ahn,et al. NDA: Near-DRAM acceleration architecture leveraging commodity DRAM devices and standard memory modules , 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).
[138] Alair Pereira do Lago,et al. Adaptive compressed caching: design and implementation , 2003, Proceedings. 15th Symposium on Computer Architecture and High Performance Computing.
[139] Onur Mutlu,et al. Techniques for bandwidth-efficient prefetching of linked data structures in hybrid prefetching systems , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.
[140] David A. Wood,et al. Interactions Between Compression and Prefetching in Chip Multiprocessors , 2007, 2007 IEEE 13th International Symposium on High Performance Computer Architecture.
[141] Chris Fallin,et al. Next generation on-chip networks: what kind of congestion control do we need? , 2010, Hotnets-IX.
[142] Roy T. Fielding,et al. The Apache HTTP Server Project , 1997, IEEE Internet Comput..
[143] Onur Mutlu,et al. Coordinated control of multiple prefetchers in multi-core systems , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[144] Onur Mutlu,et al. An experimental study of data retention behavior in modern DRAM devices: implications for retention time profiling mechanisms , 2013, ISCA.
[145] Onur Mutlu,et al. Threshold voltage distribution in MLC NAND flash memory: Characterization, analysis, and modeling , 2013, 2013 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[146] Onur Mutlu,et al. Exploiting compressed block size as an indicator of future reuse , 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).
[147] Thomas F. Wenisch,et al. Memory persistency , 2014, 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).
[148] Onur Mutlu,et al. Simultaneous Multi-Layer Access , 2016, ACM Trans. Archit. Code Optim..
[149] Jongmoo Choi,et al. ThyNVM: Enabling software-transparent crash consistency in persistent memory systems , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[150] Onur Mutlu,et al. Runahead Execution: An Effective Alternative to Large Instruction Windows , 2003, IEEE Micro.
[151] O. Mutlu,et al. Fairness via source throttling: a configurable and high-performance fairness substrate for multi-core memory systems , 2010, ASPLOS XV.
[152] J. E. Thornton,et al. Parallel operation in the Control Data 6600 , 1964, AFIPS '64 (Fall, part II).
[153] S. Phadke,et al. MLP aware heterogeneous memory system , 2011, 2011 Design, Automation & Test in Europe.
[154] Huiyang Zhou,et al. Enhancing memory-level parallelism via recovery-free value prediction , 2005, IEEE Transactions on Computers.
[155] Onur Mutlu,et al. AVATAR: A Variable-Retention-Time (VRT) Aware Refresh for DRAM Systems , 2015, 2015 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks.
[156] Aamer Jaleel,et al. Adaptive insertion policies for high performance caching , 2007, ISCA '07.
[157] Andrew A. Chien,et al. The future of microprocessors , 2011, Commun. ACM.
[158] Mor Harchol-Balter,et al. Thread Cluster Memory Scheduling: Exploiting Differences in Memory Access Behavior , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[159] Onur Mutlu,et al. The Dirty-Block Index , 2014, 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).
[160] David W. Nellans,et al. Micro-pages: increasing DRAM efficiency with locality-aware data placement , 2010, ASPLOS XV.
[161] Zhao Zhang,et al. Mini-rank: Adaptive DRAM architecture for improving memory power efficiency , 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.
[162] Onur Mutlu,et al. Phase change memory architecture and the quest for scalability , 2010, Commun. ACM.
[163] Srinivasan Seshan,et al. On-chip networks from a networking perspective: congestion and scalability in many-core interconnects , 2012, SIGCOMM '12.
[164] Vijayalakshmi Srinivasan,et al. Scalable high performance main memory system using phase-change memory technology , 2009, ISCA '09.
[165] A. Ueno,et al. A 16-Mbit DRAM with a relaxed sense-amplifier-pitch open-bit-line architecture , 1988 .
[166] Anoop Gupta,et al. Scheduling and page migration for multiprocessor compute servers , 1994, ASPLOS VI.
[167] Thomas Vogelsang,et al. Understanding the Energy Consumption of Dynamic Random Access Memories , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[168] Onur Mutlu,et al. Data retention in MLC NAND flash memory: Characterization, optimization, and recovery , 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).
[169] Jie Liu,et al. Characterizing Application Memory Error Vulnerability to Optimize Datacenter Cost via Heterogeneous-Reliability Memory , 2014, 2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks.
[170] Onur Mutlu,et al. Enabling Efficient and Scalable Hybrid Memories Using Fine-Granularity DRAM Cache Management , 2012, IEEE Computer Architecture Letters.
[171] André Seznec,et al. Zero-content augmented caches , 2009, ICS '09.
[172] Onur Mutlu,et al. Kilo-NOC: A heterogeneous network-on-chip architecture for scalability and service guarantees , 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).
[173] Onur Mutlu,et al. The evicted-address filter: A unified mechanism to address both cache pollution and thrashing , 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).
[174] Onur Mutlu,et al. Linearly compressed pages: A low-complexity, low-latency main memory compression framework , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[175] Meng-Fan Chang,et al. Low Store Energy, Low VDDmin, 8T2R Nonvolatile Latch and SRAM With Vertical-Stacked Resistive Memory (Memristor) Devices for Low Power Mobile Applications , 2012, IEEE Journal of Solid-State Circuits.
[176] Qiang Wu,et al. Revisiting Memory Errors in Large-Scale Production Data Centers: Analysis and Modeling of New Trends from the Field , 2015, 2015 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks.
[177] Chris Fallin,et al. Flipping bits in memory without accessing them: An experimental study of DRAM disturbance errors , 2014, 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).
[178] Jin Sun,et al. Managing Hybrid Main Memories with a Page-Utility Driven Performance Model , 2015, ArXiv.
[179] Chris Fallin,et al. Parallel application memory scheduling , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[180] Jung Ho Ahn,et al. A Comprehensive Memory Modeling Tool and Its Application to the Design and Analysis of Future Memory Hierarchies , 2008, 2008 International Symposium on Computer Architecture.
[181] Hongzhong Zheng,et al. Co-Architecting Controllers and DRAM to Enhance DRAM Process Scaling , 2014 .
[182] Onur Mutlu,et al. Architecting phase change memory as a scalable DRAM alternative , 2009, ISCA '09.
[183] Onur Mutlu,et al. Accelerating critical section execution with asymmetric multi-core architectures , 2009, ASPLOS.
[184] Chris Fallin,et al. The heterogeneous block architecture , 2014, 2014 IEEE 32nd International Conference on Computer Design (ICCD).
[185] Onur Mutlu,et al. Research Problems and Opportunities in Memory Systems , 2014, Supercomput. Front. Innov..
[186] Onur Mutlu,et al. Gather-Scatter DRAM: In-DRAM address translation to improve the spatial locality of non-unit strided accesses , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[187] Onur Mutlu,et al. Address-Value Delta (AVD) Prediction: A Hardware Technique for Efficiently Parallelizing Dependent Cache Misses , 2006, IEEE Transactions on Computers.
[188] William J. Dally,et al. Memory access scheduling , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[189] Onur Mutlu,et al. Mitigating the Memory Bottleneck With Approximate Load Value Prediction , 2016, IEEE Design & Test.
[190] Onur Mutlu,et al. Bottleneck identification and scheduling in multithreaded applications , 2012, ASPLOS XVII.
[191] Onur Mutlu,et al. Accelerating Dependent Cache Misses with an Enhanced Memory Controller , 2016, ISCA.
[192] Onur Mutlu,et al. Ramulator: A Fast and Extensible DRAM Simulator , 2016, IEEE Computer Architecture Letters.
[193] Richard C. Foss,et al. High-speed, high-reliability circuit design for megabit DRAM , 1991 .
[194] Sally A. McKee,et al. Hitting the memory wall: implications of the obvious , 1995, CARN.
[195] Onur Mutlu,et al. Page overlays: An enhanced virtual memory framework to enable fine-grained memory management , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[196] Brian Rogers,et al. Scaling the bandwidth wall: challenges in and avenues for CMP scaling , 2009, ISCA '09.
[197] Feng Lin,et al. DRAM Circuit Design: Fundamental and High-Speed Topics , 2007 .
[198] Onur Mutlu,et al. Efficient Runahead Execution: Power-Efficient Memory Latency Tolerance , 2006, IEEE Micro.
[199] J.Y. Lee,et al. Simultaneously formed storage node contact and metal contact cell (SSMC) for 1 Gb DRAM and beyond , 1996, International Electron Devices Meeting. Technical Digest.
[200] Christoforos E. Kozyrakis,et al. Practical Near-Data Processing for In-Memory Analytics Frameworks , 2015, 2015 International Conference on Parallel Architecture and Compilation (PACT).
[201] Hiroyuki Kobayashi,et al. Fast cycle RAM (FCRAM); a 20-ns random row access, pipe-lined operating DRAM , 1998, 1998 Symposium on VLSI Circuits. Digest of Technical Papers (Cat. No.98CH36215).
[202] D. Tavangarian,et al. Automatic on-line memory tests in workstations , 1994, Proceedings of IEEE International Workshop on Memory Technology, Design, and Test.
[203] Franziska Hoffmann,et al. Design of Analog CMOS Integrated Circuits , 2016 .
[204] Jun Yang,et al. Frequent Value Locality and Value-Centric Data Cache Design , 2000, ASPLOS.
[205] Kevin Kai-Wei Chang,et al. DASH: Deadline-Aware High-Performance Memory Scheduler for Heterogeneous Systems with Hardware Accelerators , 2016, ACM Trans. Archit. Code Optim..
[206] Jung Ho Ahn,et al. Multicore DIMM: an Energy Efficient Memory Module with Independently Controlled DRAMs , 2009, IEEE Computer Architecture Letters.
[207] Onur Mutlu,et al. Distributed order scheduling and its application to multi-core DRAM controllers , 2008, PODC '08.
[208] Xiang Li,et al. Thermal management of high power memory module for server platforms , 2008, 2008 11th Intersociety Conference on Thermal and Thermomechanical Phenomena in Electronic Systems.
[209] Onur Mutlu,et al. Feedback Directed Prefetching: Improving the Performance and Bandwidth-Efficiency of Hardware Prefetchers , 2007, 2007 IEEE 13th International Symposium on High Performance Computer Architecture.
[210] Onur Mutlu,et al. FIRM: Fair and High-Performance Memory Control for Persistent Memory Systems , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[211] M. Togo,et al. 64 Mb 6.8 ns random ROW access DRAM macro for ASICs , 1999, 1999 IEEE International Solid-State Circuits Conference. Digest of Technical Papers. ISSCC. First Edition (Cat. No.99CH36278).
[212] André Seznec,et al. Exploiting Single-Usage for Effective Memory Management , 2007, Asia-Pacific Computer Systems Architecture Conference.
[213] Hiroki Koike,et al. A 30-ns 64-Mb DRAM with built-in self-test and self-repair function , 1992 .
[214] Onur Mutlu,et al. Transparent Offloading and Mapping (TOM): Enabling Programmer-Transparent Near-Data Processing in GPU Systems , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[215] Mor Harchol-Balter,et al. ATLAS: A scalable and high-performance scheduling algorithm for multiple memory controllers , 2010, HPCA - 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture.
[216] Onur Mutlu,et al. Tiered-latency DRAM: A low latency and low cost DRAM architecture , 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).
[217] Onur Mutlu,et al. Utility-based acceleration of multithreaded applications on asymmetric CMPs , 2013, ISCA.
[218] K. Gotoh,et al. Technique for controlling effective Vth in multi-Gbit DRAM sense amplifier , 1996, 1996 Symposium on VLSI Circuits. Digest of Technical Papers.
[219] Onur Mutlu,et al. Understanding Latency Variation in Modern DRAM Chips: Experimental Characterization, Analysis, and Optimization , 2016, SIGMETRICS.
[220] Maurice V. Wilkes,et al. The memory gap and the future of high performance memories , 2001, CARN.
[221] S. Narasimha,et al. 22nm High-performance SOI technology featuring dual-embedded stressors, Epi-Plate High-K deep-trench embedded DRAM and self-aligned Via 15LM BEOL , 2012, 2012 International Electron Devices Meeting.
[222] Mahmut T. Kandemir,et al. Exploiting Core Criticality for Enhanced GPU Performance , 2016, SIGMETRICS.
[223] David J. DeWitt,et al. DBMSs on a Modern Processor: Where Does Time Go? , 1999, VLDB.
[224] Jesús Corbal,et al. Dynamic Tolerance Region Computing for Multimedia , 2012, IEEE Trans. Computers.
[225] K.J. Nesbit,et al. AC/DC: an adaptive data cache prefetcher , 2004, Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004..
[226] Onur Mutlu,et al. Techniques for efficient processing in runahead execution engines , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[227] Kevin Kai-Wei Chang,et al. HAT: Heterogeneous Adaptive Throttling for On-Chip Networks , 2012, 2012 IEEE 24th International Symposium on Computer Architecture and High Performance Computing.
[228] H.-S. Philip Wong,et al. Phase Change Memory , 2010, Proceedings of the IEEE.
[229] R. Symanczyk,et al. Conductive bridging RAM (CBRAM): an emerging non-volatile memory technology scalable to sub 20nm , 2005, IEEE International Electron Devices Meeting, 2005. IEDM Technical Digest..
[230] Mithuna Thottethodi,et al. Self-tuned congestion control for multiprocessor networks , 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.
[231] Bianca Schroeder,et al. A Large-Scale Study of Failures in High-Performance Computing Systems , 2010, IEEE Trans. Dependable Secur. Comput..
[232] Qiang Wu,et al. A Large-Scale Study of Flash Memory Failures in the Field , 2015, SIGMETRICS 2015.
[233] David A. Patterson,et al. Latency lags bandwidth , 2004, CACM.
[234] Dirk Grunwald,et al. A stateless, content-directed data prefetching mechanism , 2002, ASPLOS X.
[235] Onur Mutlu,et al. Data marshaling for multi-core architectures , 2010, ISCA.
[236] Onur Mutlu,et al. Fast Bulk Bitwise AND and OR in DRAM , 2015, IEEE Computer Architecture Letters.
[237] Dean M. Tullsen,et al. Hardware identification of cache conflict misses , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.
[238] B J Smith,et al. A pipelined, shared resource MIMD computer , 1986 .
[239] Doris Schmitt-Landsiedel,et al. DRAM Yield Analysis and Optimization by a Statistical Design Approach , 2011, IEEE Transactions on Circuits and Systems I: Regular Papers.
[240] Onur Mutlu,et al. Preemptive Virtual Clock: A flexible, efficient, and cost-effective QOS scheme for networks-on-chip , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[241] Onur Mutlu,et al. Express Cube Topologies for on-Chip Interconnects , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.
[242] Vilas Sridharan,et al. A study of DRAM failures in the field , 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[243] Andrew F. Glew. MLP yes! ILP no , 1998, ASPLOS 1998.
[244] Bianca Schroeder,et al. Temperature management in data centers: why some (might) like it hot , 2012, SIGMETRICS '12.
[245] Kevin Zhang,et al. 2nd generation embedded DRAM with 4X lower self refresh power in 22nm Tri-Gate CMOS technology , 2014, 2014 Symposium on VLSI Circuits Digest of Technical Papers.
[246] Rei-Fu Huang,et al. Fault models for embedded-DRAM macros , 2009, 2009 46th ACM/IEEE Design Automation Conference.
[247] Norbert Wehn,et al. Exploiting expendable process-margins in DRAMs for run-time performance optimization , 2014, 2014 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[248] Onur Mutlu,et al. Mitigating Prefetcher-Caused Pollution Using Informed Caching Policies for Prefetched Blocks , 2014, ACM Trans. Archit. Code Optim..
[249] Jim Zelenka,et al. Informed prefetching and caching , 1995, SOSP.
[250] Jose-Maria Arnau,et al. Eliminating redundant fragment shader executions on a mobile GPU via hardware memoization , 2014, 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).
[251] Reetuparna Das,et al. A case for hierarchical rings with deflection routing: An energy-efficient on-chip communication substrate , 2016, Parallel Comput..
[252] Jongmoo Choi,et al. Decoupled Direct Memory Access: Isolating CPU and IO Traffic by Leveraging a Dual-Data-Port DRAM , 2015, 2015 International Conference on Parallel Architecture and Compilation (PACT).
[253] C. Alkan,et al. Fast and accurate mapping of Complete Genomics reads , 2015, Methods.
[254] Onur Mutlu,et al. Read Disturb Errors in MLC NAND Flash Memory: Characterization, Mitigation, and Recovery , 2015, 2015 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks.
[255] Onur Mutlu,et al. Efficient Data Mapping and Buffering Techniques for Multilevel Cell Phase-Change Memories , 2014, ACM Trans. Archit. Code Optim..
[256] David A. Wood,et al. Frequent Pattern Compression: A Significance-Based Compression Scheme for L2 Caches , 2004 .
[257] S. Narasimha,et al. High Performance 45-nm SOI Technology with Enhanced Strain, Porous Low-k BEOL, and Immersion Lithography , 2006, 2006 International Electron Devices Meeting.
[258] Karthik Ramani,et al. Microarchitectural wire management for performance and power in partitioned architectures , 2005, 11th International Symposium on High-Performance Computer Architecture.
[259] Mike Ignatowski,et al. TOP-PIM: throughput-oriented programmable processing in memory , 2014, HPDC '14.
[260] Christoforos E. Kozyrakis,et al. Improving System Energy Efficiency with Memory Rank Subsetting , 2012, TACO.