An experimental approach to performance measurement of heterogeneous parallel applications using CUDA
[1] Massimiliano Fatica. Accelerating linpack with CUDA on heterogenous clusters, 2009, GPGPU-2.
[2] Dieter Kranzlmüller, et al. Tools for Scalable Parallel Program Analysis - Vampir VNG and DeWiz, 2004, DAPSYS.
[3] Laxmikant V. Kalé, et al. Programming Petascale Applications with Charm, 2007.
[4] Allen D. Malony, et al. The Tau Parallel Performance System, 2006, Int. J. High Perform. Comput. Appl.
[5] Collin McCurdy, et al. The Scalable Heterogeneous Computing (SHOC) benchmark suite, 2010, GPGPU-3.
[6] Wolfgang E. Nagel, et al. Event Tracing and Visualization for Cell Broadband Engine Systems, 2008, Euro-Par.
[7] Wolfgang E. Nagel, et al. Introducing the Open Trace Format (OTF), 2006, International Conference on Computational Science.
[8] Allen D. Malony, et al. Performance Measurement of Applications with GPU Acceleration using CUDA, 2009, PARCO.
[9] Laxmikant V. Kalé, et al. Scalable molecular dynamics with NAMD, 2005, J. Comput. Chem.
[10] Matthias S. Müller, et al. Tools for scalable parallel program analysis: Vampir NG, MARMOT, and DeWiz, 2009, Int. J. Comput. Sci. Eng.
[11] Laxmikant V. Kalé, et al. Integrated Performance Views in Charm++: Projections Meets TAU, 2009, 2009 International Conference on Parallel Processing.
[12] William Gropp, et al. From Trace Generation to Visualization: A Performance Framework for Distributed Parallel Systems, 2000, ACM/IEEE SC 2000 Conference (SC'00).
[13] Allen D. Malony, et al. ParaProf: A Portable, Extensible, and Scalable Tool for Parallel Performance Profile Analysis, 2003, Euro-Par.