HAM: Hotspot-Aware Manager for Improving Communications With 3D-Stacked Memory
Antonino Tumeo | John D. Leidel | Xi Wang | Yong Chen | Jie Li
[1] David R. Kaeli,et al. Exploiting Memory Access Patterns to Improve Memory Performance in Data-Parallel Architectures , 2011, IEEE Transactions on Parallel and Distributed Systems.
[2] Lizy Kurian John,et al. Minimalist open-page: A DRAM page-mode scheduling policy for the many-core era , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[3] Christoforos E. Kozyrakis,et al. Practical Near-Data Processing for In-Memory Analytics Frameworks , 2015, 2015 International Conference on Parallel Architecture and Compilation (PACT).
[4] Srinivas Devadas,et al. IMP: Indirect memory prefetcher , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[5] Margaret Martonosi,et al. TLB Improvements for Chip Multiprocessors: Inter-Core Cooperative Prefetchers and Shared Last-Level TLBs , 2013, TACO.
[6] P. Sadayappan,et al. Characterizing and enhancing global memory data coalescing on GPUs , 2015, 2015 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).
[7] Paul Rosenfeld,et al. Performance Exploration of the Hybrid Memory Cube , 2014.
[8] Seth H. Pugsley,et al. Perceptron-Based Prefetch Filtering , 2019, 2019 ACM/IEEE 46th Annual International Symposium on Computer Architecture (ISCA).
[9] Alejandro Duran,et al. Barcelona OpenMP Tasks Suite: A Set of Benchmarks Targeting the Exploitation of Task Parallelism in OpenMP , 2009, 2009 International Conference on Parallel Processing.
[10] David Roberts,et al. Heterogeneous memory architectures: A HW/SW approach for mixing die-stacked and off-package memories , 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).
[11] Fabio Checconi,et al. A Throughput-Optimized Optical Network for Data-Intensive Computing , 2014, IEEE Micro.
[12] Reena Panda,et al. HALO: A Hierarchical Memory Access Locality Modeling Technique For Memory System Explorations , 2018, ICS.
[13] Ramyad Hadidi,et al. GraphPIM: Enabling Instruction-Level PIM Offloading in Graph Computing Frameworks , 2017, 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[14] David A. Patterson,et al. The GAP Benchmark Suite , 2015, ArXiv.
[15] Kevin Skadron,et al. Dymaxion: Optimizing memory access patterns for heterogeneous systems , 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[16] Sam Ainsworth,et al. Software prefetching for indirect memory accesses , 2017, 2017 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).
[17] Bahar Asgari,et al. Performance Implications of NoCs on 3D-Stacked Memories: Insights from the Hybrid Memory Cube , 2017, 2018 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[18] Onur Mutlu,et al. The Blacklisting Memory Scheduler: Balancing Performance, Fairness and Complexity , 2015, ArXiv.
[19] Calvin Lin,et al. Linearizing irregular memory accesses for improved correlated prefetching , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[20] David H. Bailey,et al. NAS parallel benchmark results , 1993, IEEE Parallel & Distributed Technology: Systems & Applications.
[21] Maya Gokhale,et al. Hybrid memory cube performance characterization on data-centric workloads , 2015, IA3@SC.
[22] Rajeev Balasubramonian,et al. Managing DRAM Latency Divergence in Irregular GPGPU Applications , 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.
[23] Zhichun Zhu,et al. CAMPS: Conflict-Aware Memory-Side Prefetching Scheme for Hybrid Memory Cube , 2018, ICPP.
[24] Janak H. Patel,et al. Stride directed prefetching in scalar processors , 1992, MICRO.
[25] Sally A. McKee,et al. Hitting the memory wall: implications of the obvious , 1995, CARN.
[26] Richard W. Vuduc,et al. Many-Thread Aware Prefetching Mechanisms for GPGPU Applications , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[27] U. Brandes. A faster algorithm for betweenness centrality , 2001.
[28] Rahul Boyapati,et al. Active-Routing: Compute on the Way for Near-Data Processing , 2019, 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[29] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers , 1990, Proceedings of the 17th Annual International Symposium on Computer Architecture.
[30] Marc Casas,et al. Data Prefetching on In-order Processors , 2018, 2018 International Conference on High Performance Computing & Simulation (HPCS).
[31] William J. Dally,et al. Memory access scheduling , 2000, Proceedings of 27th International Symposium on Computer Architecture.
[32] Kevin Skadron,et al. Rodinia: A benchmark suite for heterogeneous computing , 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[33] Hans-Peter Kriegel,et al. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise , 1996, KDD.
[34] Mahmut T. Kandemir,et al. Meeting midway: Improving CMP performance with memory-side prefetching , 2013, Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques.
[35] Yong Chen,et al. HMC-Sim: A Simulation Framework for Hybrid Memory Cube Devices , 2014, 2014 IEEE International Parallel & Distributed Processing Symposium Workshops.
[36] Krishna M. Kavi,et al. HBM-Resident Prefetching for Heterogeneous Memory System , 2017, ARCS.
[37] William J. Dally,et al. Architecting an Energy-Efficient DRAM System for GPUs , 2017, 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[38] Scott A. Mahlke,et al. WarpPool: Sharing requests with inter-warp coalescing for throughput processors , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[39] Satish Narayanasamy,et al. InvisiMem: Smart memory defenses for memory bus side channel , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[40] Sudhakar Yalamanchili,et al. Demystifying the characteristics of 3D-stacked memories: A case study for Hybrid Memory Cube , 2017, 2017 IEEE International Symposium on Workload Characterization (IISWC).