CLR-DRAM: A Low-Cost DRAM Architecture Enabling Dynamic Capacity-Latency Trade-Off
Onur Mutlu | Hasan Hassan | Lois Orosa | Jisung Park | Taha Shahroodi | Minesh Patel | Haocong Luo | Abdullah Giray Yağlıkçı
[1] Rachata Ausavarungnirun,et al. The Virtual Block Interface: A Flexible Alternative to the Conventional Virtual Memory Framework , 2020, 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA).
[2] Onur Mutlu,et al. Understanding Latency Variation in Modern DRAM Chips: Experimental Characterization, Analysis, and Optimization , 2016, SIGMETRICS.
[3] Sally A. McKee,et al. Hitting the memory wall: implications of the obvious , 1995, CARN.
[4] Onur Mutlu,et al. In-DRAM Bulk Bitwise Execution Engine , 2019, ArXiv.
[5] Yuan Xie,et al. ProactiveDRAM: A DRAM-initiated retention management scheme , 2014, 2014 IEEE 32nd International Conference on Computer Design (ICCD).
[6] Hideto Hidaka,et al. The cache DRAM architecture: a DRAM with an on-chip cache memory , 1990, IEEE Micro.
[7] Mor Harchol-Balter,et al. Thread Cluster Memory Scheduling , 2011, IEEE Micro.
[8] Jun Yang,et al. Restore truncation for performance improvement in future DRAM systems , 2016, 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[9] Onur Mutlu,et al. Detecting and Mitigating Data-Dependent DRAM Failures by Exploiting Current Memory Content , 2017, 2017 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[10] Vijay Janapa Reddi,et al. PIN: a binary instrumentation tool for computer architecture research and education , 2004, WCAE '04.
[11] Stijn Eyerman,et al. System-Level Performance Metrics for Multiprogram Workloads , 2008, IEEE Micro.
[12] Rachata Ausavarungnirun,et al. Google Workloads for Consumer Devices: Mitigating Data Movement Bottlenecks , 2018, ASPLOS.
[13] Onur Mutlu,et al. Runahead execution: an alternative to very large instruction windows for out-of-order processors , 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings.
[14] Yang Wang,et al. How Does the Workload Look Like in Production Cloud? Analysis and Clustering of Workloads on Alibaba Cluster Trace , 2018, 2018 IEEE 24th International Conference on Parallel and Distributed Systems (ICPADS).
[15] O Seongil,et al. Row-buffer decoupling: A case for low-latency DRAM microarchitecture , 2014, 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).
[16] Mayu Aoki,et al. A precise on-chip voltage generator for a gigascale DRAM with a negative word-line scheme , 1999 .
[17] Wongyu Shin,et al. Multiple Clone Row DRAM: A low latency and area optimized DRAM , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[18] Yun Chen,et al. Supporting Differentiated Services in Computers via Programmable Architecture for Resourcing-on-Demand (PARD) , 2015, ASPLOS.
[19] Onur Mutlu,et al. Research Problems and Opportunities in Memory Systems , 2014, Supercomput. Front. Innov..
[20] Onur Mutlu,et al. Processing-in-memory: A workload-driven perspective , 2019, IBM J. Res. Dev..
[21] Maurice V. Wilkes,et al. The memory gap and the future of high performance memories , 2001, CARN.
[22] Onur Mutlu,et al. Understanding Reduced-Voltage Operation in Modern DRAM Devices , 2017, Proc. ACM Meas. Anal. Comput. Syst..
[23] Stefan Mangard,et al. DRAMA: Exploiting DRAM Addressing for Cross-CPU Attacks , 2015, USENIX Security Symposium.
[24] Y. Nakagome,et al. Trends in low-power RAM circuit technologies , 1995 .
[25] Ninghui Sun,et al. Labeled RISC-V: A New Perspective on Software-Defined Architecture , 2017 .
[26] Onur Mutlu,et al. Memory scaling: A systems architecture perspective , 2013, 2013 5th IEEE International Memory Workshop.
[27] Feng Lin,et al. DRAM Circuit Design: Fundamental and High-Speed Topics , 2007 .
[28] Onur Mutlu,et al. CROW: A Low-Cost Substrate for Improving DRAM Performance, Energy Efficiency, and Reliability , 2019, 2019 ACM/IEEE 46th Annual International Symposium on Computer Architecture (ISCA).
[29] Onur Mutlu,et al. ChargeCache: Reducing DRAM latency by exploiting row access locality , 2016, 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[30] Onur Mutlu,et al. Low-Cost Inter-Linked Subarrays (LISA): Enabling fast inter-subarray data movement in DRAM , 2016, 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[31] Jongmoo Choi,et al. Decoupled Direct Memory Access: Isolating CPU and IO Traffic by Leveraging a Dual-Data-Port DRAM , 2015, 2015 International Conference on Parallel Architecture and Compilation (PACT).
[32] Sheng Di,et al. Characterization and Comparison of Cloud versus Grid Workloads , 2012, 2012 IEEE International Conference on Cluster Computing.
[33] Young-Hyun Jun,et al. Simultaneous Reverse Body and Negative Word-Line Biasing Control Scheme for Leakage Reduction of DRAM , 2011, IEEE Journal of Solid-State Circuits.
[34] Onur Mutlu,et al. The efficacy of error mitigation techniques for DRAM retention failures: a comparative experimental study , 2014, SIGMETRICS '14.
[35] Kiyoo Itoh,et al. Long-Retention-Time, High-Speed DRAM Array with 12-F2 Twin Cell for Sub 1-V Operation , 2007, IEICE Trans. Electron..
[36] Richard Veras,et al. RAIDR: Retention-aware intelligent DRAM refresh , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).
[37] Hiroyuki Kobayashi,et al. Fast cycle RAM (FCRAM); a 20-ns random row access, pipe-lined operating DRAM , 1998, 1998 Symposium on VLSI Circuits. Digest of Technical Papers (Cat. No.98CH36215).
[38] Nick Knupffer. Intel Corporation , 2018, The Grants Register 2019.
[39] Onur Mutlu,et al. SMASH: Co-designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations , 2019, MICRO.
[40] Onur Mutlu,et al. A case for exploiting subarray-level parallelism (SALP) in DRAM , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).
[41] Masashi Horiguchi,et al. Nanoscale Memory Repair , 2011, Integrated Circuits and Systems.
[42] Charles Reiss,et al. Towards understanding heterogeneous clouds at scale: Google trace analysis , 2012 .
[43] Onur Mutlu,et al. VRL-DRAM: Improving DRAM Performance via Variable Refresh Latency , 2018, 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC).
[44] Onur Mutlu,et al. Simultaneous Multi-Layer Access , 2016, ACM Trans. Archit. Code Optim..
[45] Onur Mutlu,et al. AVATAR: A Variable-Retention-Time (VRT) Aware Refresh for DRAM Systems , 2015, 2015 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks.
[46] Onur Mutlu,et al. DSPatch: Dual Spatial Pattern Prefetcher , 2019, MICRO.
[47] Kazuaki Murakami,et al. Optimizing the DRAM refresh count for merged DRAM/logic LSIs , 1998, Proceedings. 1998 International Symposium on Low Power Electronics and Design (IEEE Cat. No.98TH8379).
[48] Binoy Ravindran,et al. Quantifying Memory Underutilization in HPC Systems and Using it to Improve Performance via Architecture Support , 2019, MICRO.
[49] Kevin K. Chang,et al. Understanding and Improving the Latency of DRAM-Based Memory Systems , 2017, ArXiv.
[50] Wayne H. Wolf,et al. MediaBench II video: Expediting the next generation of video systems research , 2009, Microprocess. Microsystems.
[51] Björn Andersson,et al. Coordinated Bank and Cache Coloring for Temporal Protection of Memory Accesses , 2013, 2013 IEEE 16th International Conference on Computational Science and Engineering.
[52] Onur Mutlu,et al. Memory Performance Attacks: Denial of Memory Service in Multi-Core Systems , 2007, USENIX Security Symposium.
[53] Onur Mutlu,et al. What Your DRAM Power Models Are Not Telling You: Lessons from a Detailed Experimental Study , 2018, SIGMETRICS.
[54] Onur Mutlu,et al. PARBOR: An Efficient System-Level Technique to Detect Data-Dependent Failures in DRAM , 2016, 2016 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN).
[55] Onur Mutlu,et al. Solar-DRAM: Reducing DRAM Access Latency by Exploiting the Variation in Local Bitlines , 2018, 2018 IEEE 36th International Conference on Computer Design (ICCD).
[56] Onur Mutlu,et al. The reach profiler (REAPER): Enabling the mitigation of DRAM retention failures via profiling at aggressive conditions , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[57] O Seongil,et al. Reducing memory access latency with asymmetric DRAM bank organizations , 2013, ISCA.
[58] Onur Mutlu,et al. An experimental study of data retention behavior in modern DRAM devices: implications for retention time profiling mechanisms , 2013, ISCA.
[59] Onur Mutlu,et al. Reducing DRAM Latency via Charge-Level-Aware Look-Ahead Partial Restoration , 2018, 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[60] Dae-Hyun Kim,et al. ArchShield: architectural framework for assisting DRAM scaling by tolerating high error rates , 2013, ISCA.
[61] Daehyun Kim,et al. ECC-ASPIRIN: An ECC-assisted post-package repair scheme for aging errors in DRAMs , 2016, 2016 IEEE 34th VLSI Test Symposium (VTS).
[62] Onur Mutlu,et al. A Case for Memory Content-Based Detection and Mitigation of Data-Dependent Failures in DRAM , 2017, IEEE Computer Architecture Letters.
[63] Dean M. Tullsen,et al. Symbiotic jobscheduling for a simultaneous multithreading processor , 2000, SIGP.
[64] Onur Mutlu,et al. Self-Optimizing Memory Controllers: A Reinforcement Learning Approach , 2008, 2008 International Symposium on Computer Architecture.
[65] Brad Calder,et al. SimPoint 3.0: Faster and More Flexible Program Phase Analysis , 2005, J. Instr. Level Parallelism.
[66] Onur Mutlu,et al. Adaptive-latency DRAM: Optimizing DRAM timing for the common-case , 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).
[67] Tao Zhang,et al. Half-DRAM: A high-bandwidth and low-power DRAM architecture from the rethinking of fine-grained activation , 2014, 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).
[68] Donghyuk Lee,et al. Reducing DRAM Latency at Low Cost by Exploiting Heterogeneity , 2016, ArXiv.
[69] Gu-Yeon Wei,et al. Profiling a warehouse-scale computer , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[70] Rachata Ausavarungnirun,et al. Reducing DRAM Latency by Exploiting Design-Induced Latency Variation in Modern DRAM Chips , 2016, ArXiv.
[72] Bruce Jacob,et al. Flexible auto-refresh: Enabling scalable and energy-efficient DRAM refresh reductions , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[73] Onur Mutlu,et al. Stall-Time Fair Memory Access Scheduling for Chip Multiprocessors , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[74] Kunle Olukotun,et al. The Future of Microprocessors , 2005, ACM Queue.
[75] Eric Rotenberg,et al. Retention-aware placement in DRAM (RAPID): software methods for quasi-non-volatile DRAM , 2006, The Twelfth International Symposium on High-Performance Computer Architecture, 2006..
[76] Rachata Ausavarungnirun,et al. RowClone: Fast and energy-efficient in-DRAM bulk data copy and initialization , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[77] Harish Patil,et al. Pin: building customized program analysis tools with dynamic instrumentation , 2005, PLDI '05.
[78] Jin-Young Kim,et al. An 8 Gb/s/pin 9.6 ns Row-Cycle 288 Mb Deca-Data Rate SDRAM With an I/O Error Detection Scheme , 2006, IEEE Journal of Solid-State Circuits.
[79] Onur Mutlu,et al. Ambit: In-Memory Accelerator for Bulk Bitwise Operations Using Commodity DRAM Technology , 2017, 2017 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[80] David A. Wood,et al. A comparative analysis of microarchitecture effects on CPU and GPU memory system behavior , 2014, 2014 IEEE International Symposium on Workload Characterization (IISWC).
[81] Onur Mutlu,et al. EDEN: Enabling Energy-Efficient, High-Performance Deep Neural Network Inference Using Approximate DRAM , 2019, MICRO.
[82] Yusuf Leblebici,et al. Subthreshold leakage reduction: A comparative study of SCL and CMOS design , 2009, 2009 IEEE International Symposium on Circuits and Systems.
[83] David Roberts,et al. Binary Star: Coordinated Reliability in Heterogeneous Memory Systems for High Performance and Scalability , 2019, MICRO.
[84] Onur Mutlu,et al. A Case for Richer Cross-Layer Abstractions: Bridging the Semantic Gap with Expressive Memory , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[85] Kiyoo Itoh,et al. A 0.5-V FD-SOI Twin-Cell DRAM with Offset-Free Dynamic-VT Sense Amplifiers , 2006, ISLPED'06 Proceedings of the 2006 International Symposium on Low Power Electronics and Design.
[86] Onur Mutlu,et al. Improving DRAM performance by parallelizing refreshes with accesses , 2014, 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).
[87] Song Liu,et al. Flikker: saving DRAM refresh-power through critical data partitioning , 2011, ASPLOS XVI.
[88] Chia-Lin Yang,et al. SECRET: Selective error correction for refresh energy reduction in DRAMs , 2012, 2012 IEEE 30th International Conference on Computer Design (ICCD).
[89] Mor Harchol-Balter,et al. ATLAS: A Scalable and High-Performance Scheduling Algorithm for Multiple Memory Controllers , 2010 .
[90] Onur Mutlu,et al. Tiered-latency DRAM: A low latency and low cost DRAM architecture , 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).
[91] Onur Mutlu,et al. Demystifying Complex Workload-DRAM Interactions: An Experimental Study , 2019, SIGMETRICS.
[92] Thomas Vogelsang,et al. Understanding the Energy Consumption of Dynamic Random Access Memories , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[93] Kee-Won Kwon,et al. An 8Gb/s/pin 9.6ns Row-Cycle 288Mb Deca-Data Rate SDRAM with an I/O Error-Detection Scheme , 2006, 2006 IEEE International Solid State Circuits Conference - Digest of Technical Papers.
[94] Onur Mutlu,et al. Gather-Scatter DRAM: In-DRAM address translation to improve the spatial locality of non-unit strided accesses , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[95] Onur Mutlu,et al. Ramulator: A Fast and Extensible DRAM Simulator , 2016, IEEE Computer Architecture Letters.
[96] Sally A. McKee,et al. DTail: a flexible approach to DRAM refresh management , 2014, ICS '14.