E^2MC: Entropy Encoding Based Memory Compression for GPUs
[1] David A. Wood, et al. Frequent Pattern Compression: A Significance-Based Compression Scheme for L2 Caches, 2004.
[2] Won Woo Ro, et al. Warped-Compression: Enabling power efficient GPUs through register compression, 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[3] Kevin Skadron, et al. Rodinia: A benchmark suite for heterogeneous computing, 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[4] Nam Sung Kim, et al. GPUWattch: enabling energy optimizations in GPGPUs, 2013, ISCA.
[5] Mahmut T. Kandemir, et al. A case for Core-Assisted Bottleneck Acceleration in GPUs: Enabling flexible data compression with assist warps, 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[6] Onur Mutlu, et al. Base-delta-immediate compression: Practical data compression for on-chip caches, 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).
[7] Ben H. H. Juurlink, et al. GPGPU workload characteristics and performance analysis, 2014, 2014 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS XIV).
[8] Xi Chen, et al. C-Pack: A High-Performance Microprocessor Cache Compression Algorithm, 2010, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[9] Martin Burtscher, et al. Microarchitectural performance characterization of irregular GPU kernels, 2014, 2014 IEEE International Symposium on Workload Characterization (IISWC).
[10] William J. Dally, et al. GPUs and the Future of Parallel Computing, 2011, IEEE Micro.
[11] Nam Sung Kim, et al. Lossless and lossy memory I/O link compression for improving performance of GPGPU workloads, 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).
[12] Keshav Pingali, et al. Lonestar: A suite of parallel irregular programs, 2009, 2009 IEEE International Symposium on Performance Analysis of Systems and Software.
[13] Onur Mutlu, et al. Memory scaling: A systems architecture perspective, 2013, 2013 5th IEEE International Memory Workshop.
[14] Bingsheng He, et al. Mars: Accelerating MapReduce with Graphics Processors, 2011, IEEE Transactions on Parallel and Distributed Systems.
[15] Per Stenström, et al. SC2: A statistical compression cache scheme, 2014, 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).
[16] Henry Wong, et al. Analyzing CUDA workloads using a detailed GPU simulator, 2009, 2009 IEEE International Symposium on Performance Analysis of Systems and Software.
[17] Eugene S. Schwartz, et al. Generating a canonical prefix encoding, 1964, CACM.
[18] Ben H. H. Juurlink, et al. How a single chip causes massive power bills GPUSimPow: A GPGPU power simulator, 2013, 2013 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[19] Collin McCurdy, et al. The Scalable Heterogeneous Computing (SHOC) benchmark suite, 2010, GPGPU-3.