Application-oriented ping-pong benchmarking: how to assess the real communication overheads
Torsten Hoefler | Timo Schneider | Robert Gerstenberger
[1] Hubert Ritzdorf, et al. Flattening on the Fly: Efficient Handling of MPI Derived Datatypes, 1999, PVM/MPI.
[2] Jeroen Tromp, et al. High-frequency simulations of global seismic wave propagation using SPECFEM3D_GLOBE on 62K processors, 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.
[3] William C. Skamarock, et al. A time-split nonhydrostatic atmospheric model for weather research and forecasting applications, 2008, J. Comput. Phys.
[4] Dhabaleswar K. Panda, et al. High performance implementation of MPI derived datatype communication over InfiniBand, 2004, 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings.
[5] Mohamed Sayeed, et al. HPC Benchmarking and Performance Evaluation With Realistic Applications, 2006.
[6] Torsten Hoefler, et al. Performance Expectations and Guidelines for MPI Derived Datatypes, 2011, EuroMPI.
[7] Larry Kaplan, et al. The Gemini System Interconnect, 2010, 2010 18th IEEE Symposium on High Performance Interconnects.
[8] A. Krasnitz, et al. Studying Quarks and Gluons on MIMD Parallel Computers, 1991, Int. J. High Perform. Comput. Appl.
[9] Alan B. Williams, et al. Poster: mini-applications: vehicles for co-design, 2011, SC '11 Companion.
[10] Jesper Larsson Träff, et al. A Benchmark for MPI Derived Datatypes, 2000, PVM/MPI.
[11] F. H. McMahon, et al. The Livermore Fortran Kernels: A Computer Test of the Numerical Performance Range, 1986.
[12] T. A. Brunner. Mulard: A Multigroup Thermal Radiation Diffusion Mini-Application, 2012.
[13] Alexander Aiken, et al. Optimal loop parallelization, 1988, PLDI '88.
[14] Torsten Hoefler, et al. Micro-applications for Communication Data Access Patterns and MPI Datatypes, 2012, EuroMPI.
[15] Surendra Byna, et al. Improving the performance of MPI derived datatypes by optimizing memory-access cost, 2003, 2003 Proceedings IEEE International Conference on Cluster Computing.
[16] Sandia Report, et al. Improving Performance via Mini-applications, 2009.
[17] Torsten Hoefler, et al. Parallel Zero-Copy Algorithms for Fast Fourier Transform and Conjugate Gradient Using MPI Datatypes, 2010, EuroMPI.
[18] Steve Plimpton, et al. Fast parallel algorithms for short-range molecular dynamics, 1993.
[19] Jesper Larsson Träff, et al. Using MPI Derived Datatypes in Numerical Libraries, 2011, EuroMPI.
[20] Kaivalya M. Dixit, et al. The SPEC benchmarks, 1991, Parallel Comput.
[21] Message Passing Interface Forum. MPI: A message-passing interface standard, 1994.
[22] R. V. D. Wijngaart. NAS Parallel Benchmarks Version 2.4, 2002.