Set variation-aware shared LLC management for CPU-GPU heterogeneous architecture
Xin Li | Zhaoying Li | Hongjun Dai | Zhiping Jia | Mengying Zhao | Lei Ju
[1] Xian-He Sun, et al. DaCache: Memory Divergence-Aware GPU Cache Management, 2015, ICS.
[2] Irving L. Traiger, et al. Evaluation Techniques for Storage Hierarchies, 1970, IBM Syst. J.
[3] Zhiping Jia, et al. Shared last-level cache management for GPGPUs with hybrid main memory, 2017, Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017.
[4] John L. Henning. SPEC CPU2006 benchmark descriptions, 2006, CARN.
[5] Yu Wang, et al. Coordinated static and dynamic cache bypassing for GPUs, 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).
[6] Antonia Zhai, et al. Managing shared last-level cache in a heterogeneous multicore processor, 2013, Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques.
[7] Yale N. Patt, et al. Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches, 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[8] Chia-Lin Yang, et al. Latency sensitivity-based cache partitioning for heterogeneous multi-core architecture, 2016, 2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC).
[9] Zhihua Wang, et al. Orchestrating Cache Management and Memory Scheduling for GPGPU Applications, 2014, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[10] David A. Wood, et al. gem5-gpu: A Heterogeneous CPU-GPU Simulator, 2015, IEEE Computer Architecture Letters.
[11] Hyesoon Kim, et al. TAP: A TLP-aware cache management policy for a CPU-GPU heterogeneous architecture, 2012, IEEE International Symposium on High-Performance Computer Architecture.
[12] Henry Wong, et al. Analyzing CUDA workloads using a detailed GPU simulator, 2009, 2009 IEEE International Symposium on Performance Analysis of Systems and Software.
[13] Luca Benini, et al. GPUguard: Towards supporting a predictable execution model for heterogeneous SoC, 2017, Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017.
[14] Shuaiwen Song, et al. Locality-Driven Dynamic GPU Cache Bypassing, 2015, ICS.
[16] Yun Liang, et al. An efficient compiler framework for cache bypassing on GPUs, 2013, ICCAD 2013.
[17] Keshav Pingali, et al. Lonestar: A suite of parallel irregular programs, 2009, 2009 IEEE International Symposium on Performance Analysis of Systems and Software.
[18] David A. Wood, et al. Heterogeneous system coherence for integrated CPU-GPU systems, 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[19] Aamer Jaleel, et al. High performance cache replacement using re-reference interval prediction (RRIP), 2010, ISCA.
[20] Chita R. Das, et al. OSCAR: Orchestrating STT-RAM cache traffic for heterogeneous CPU-GPU architectures, 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[21] Gabriel H. Loh, et al. PIPP: promotion/insertion pseudo-partitioning of multi-core shared caches, 2009, ISCA '09.
[22] Sunggu Lee, et al. Hybrid DRAM/PRAM-based main memory for single-chip CPU/GPU, 2012, DAC Design Automation Conference 2012.
[23] Kevin Skadron, et al. Rodinia: A benchmark suite for heterogeneous computing, 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[24] Dongrui Fan, et al. Enabling coordinated register allocation and thread-level parallelism optimization for GPUs, 2018, MICRO.