Accelerator-Centered Programming on Heterogeneous Systems