Dymaxion: Optimizing memory access patterns for heterogeneous systems
暂无分享,去创建一个
[1] Bingsheng He,et al. Efficient gather and scatter operations on graphics processors , 2007, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07).
[2] Teresa H. Y. Meng,et al. Merge: a programming model for heterogeneous multi-core systems , 2008, ASPLOS.
[3] Michael Garland,et al. Efficient Sparse Matrix-Vector Multiplication on CUDA , 2008 .
[4] Kevin Skadron,et al. Rodinia: A benchmark suite for heterogeneous computing , 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[5] Hyesoon Kim,et al. Qilin: Exploiting parallelism on heterogeneous multiprocessors with adaptive mapping , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[6] FoleyDenis,et al. AMD Fusion APU , 2012 .
[7] Kevin Skadron,et al. Scalable parallel programming , 2008, 2008 IEEE Hot Chips 20 Symposium (HCS).
[8] Yao Zhang,et al. Scan primitives for GPU computing , 2007, GH '07.
[9] Kevin Skadron,et al. A performance study of general-purpose applications on graphics processors using CUDA , 2008, J. Parallel Distributed Comput..
[10] David R. Kaeli,et al. Exploiting Memory Access Patterns to Improve Memory Performance in Data-Parallel Architectures , 2011, IEEE Transactions on Parallel and Distributed Systems.
[11] Andrew S. Grimshaw,et al. Parallel Scan for Stream Architectures , 2012 .
[12] Kim M. Hazelwood,et al. Where is the data? Why you cannot debate CPU vs. GPU performance without the answer , 2011, (IEEE ISPASS) IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE.
[13] Xipeng Shen,et al. On-the-fly elimination of dynamic irregularities for GPU computing , 2011, ASPLOS XVI.
[14] Erik Lindholm,et al. NVIDIA Tesla: A Unified Graphics and Computing Architecture , 2008, IEEE Micro.
[15] David W. Nellans,et al. Micro-pages: increasing DRAM efficiency with locality-aware data placement , 2010, ASPLOS XV.
[16] John Zahorjan,et al. Optimizing Data Locality by Array Restructuring , 1995 .
[17] Zhen Fang,et al. The Impulse Memory Controller , 2001, IEEE Trans. Computers.