Improving Performance and Energy Efficiency of
[1] Zizhong Chen, et al. Performance of MPI broadcast algorithms, 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[2] James Demmel, et al. Communication efficient Gaussian elimination with partial pivoting using a shape morphing data layout, 2013, SPAA.
[3] Robert A. van de Geijn, et al. Collective communication on architectures that support simultaneous communication over multiple links, 2006, PPoPP '06.
[4] Thomas Rauber, et al. Automatic Tuning of PDGEMM Towards Optimal Performance, 2005, Euro-Par.
[5] Charles E. Leiserson, et al. On-the-fly pipeline parallelism, 2013, SPAA.
[6] James Demmel, et al. Communication optimal parallel multiplication of sparse random matrices, 2013, SPAA.
[7] James Demmel, et al. Improving communication performance in dense linear algebra via topology aware collectives, 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[8] Mahmut T. Kandemir, et al. Reducing power with performance constraints for parallel sparse applications, 2005, 19th IEEE International Parallel and Distributed Processing Symposium.
[9] Dong Li, et al. PowerPack: Energy Profiling and Analysis of High-Performance Systems and Applications, 2010, IEEE Transactions on Parallel and Distributed Systems.
[10] Xin Yuan, et al. Automatic generation and tuning of MPI collective communication routines, 2005, ICS '05.
[11] Jaeyoung Choi, et al. Design and Implementation of the ScaLAPACK LU, QR, and Cholesky Factorization Routines, 1994, Sci. Program.
[12] Xin Yuan, et al. CC-MPI: a compiled communication capable MPI prototype for Ethernet switched clusters, 2003, PPoPP '03.