Performance modeling of communication and computation in hybrid MPI and OpenMP applications
[1] Ralf H. Reussner,et al. SKaMPI: A Detailed, Accurate MPI Benchmark , 1998, PVM/MPI.
[2] Ramesh Subramonian,et al. LogP: towards a realistic model of parallel computation , 1993, PPOPP '93.
[3] Rick Kufrin,et al. PerfSuite: An Accessible, Open Source Performance Analysis Environment for Linux , 2005 .
[4] Chris J. Scheiman,et al. LogGP: incorporating long messages into the LogP model—one step closer towards a realistic model for parallel computation , 1995, SPAA '95.
[5] Andrew Wolfe,et al. Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture , 2000, MICRO 2000.
[6] Danesh K. Tafti,et al. A Parallel Computing Framework for Dynamic Power Balancing in Adaptive Mesh Refinement Applications , 2000 .
[7] William Gropp,et al. Reproducible Measurements of MPI Performance Characteristics , 1999, PVM/MPI.
[8] Jesper Larsson Träff,et al. SKaMPI: a comprehensive benchmark for public benchmarking of MPI , 2002, Sci. Program..
[9] Martin B. van Gijzen,et al. Two Level Parallelism in a Stream-Function Model for Global Ocean Circulation , 2003, Euro-Par.
[10] D. S. Henty,et al. Performance of Hybrid Message-Passing and Shared-Memory Parallelism for Discrete Element Modeling , 2000, ACM/IEEE SC 2000 Conference (SC'00).
[11] Leonid Oliker,et al. Effects of Ordering Strategies and Programming Paradigms on Sparse Matrix Computations , 2013, SIAM Rev..
[12] P. Aldo Moro. Conjugate-Gradients Algorithms : An MPI-OpenMP Implementation on Distributed Shared Memory Systems , 1999 .
[13] K. Liew,et al. Parallel-multigrid computation of unsteady incompressible viscous flows using a matrix-free implicit method and high-resolution characteristics-based scheme , 2005 .
[14] J. M. Bull,et al. Measuring Synchronisation and Scheduling Overheads in OpenMP , 2007 .
[15] Chen Ding,et al. Locality phase prediction , 2004, ASPLOS XI.
[16] Jesús Labarta,et al. A Framework for Performance Modeling and Prediction , 2002, ACM/IEEE SC 2002 Conference (SC'02).
[17] G. Mahinthakumar,et al. A Hybrid Mpi-Openmp Implementation of an Implicit Finite-Element Code on Parallel Architectures , 2002, Int. J. High Perform. Comput. Appl..
[18] Patricia J. Teller,et al. Proceedings of the 2008 ACM/IEEE conference on Supercomputing , 2008, HiPC 2008.
[19] Jeffrey K. Hollingsworth,et al. Using Dynamic Tracing Sampling to Measure Long Running Programs , 2005, ACM/IEEE SC 2005 Conference (SC'05).
[20] Kees Verstoep,et al. Fast Measurement of LogP Parameters for Message Passing Platforms , 2000, IPDPS Workshops.
[21] Michael E. Wolf,et al. Combining Loop Transformations Considering Caches and Scheduling , 2004, International Journal of Parallel Programming.
[22] Mary K. Vernon,et al. Parallel program performance prediction using deterministic task graph analysis , 2004, TOCS.
[23] John M. Mellor-Crummey,et al. Cross-architecture performance predictions for scientific applications using parameterized models , 2004, SIGMETRICS '04/Performance '04.
[24] Amitava Majumdar. Parallel performance study of Monte Carlo photon transport code on shared-, distributed-, and distributed-shared-memory architectures , 2000, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000.
[25] Rainer Unland,et al. Objects, Components, Architectures, Services, and Applications for a Networked World , 2003, Lecture Notes in Computer Science.
[26] Yossi Matias,et al. Can shared-memory model serve as a bridging model for parallel computation? , 1997, SPAA '97.
[27] Kathryn S. McKinley,et al. A Compiler Optimization Algorithm for Shared-Memory Multiprocessors , 1998, IEEE Trans. Parallel Distributed Syst..
[28] Viera Sipková,et al. Exploiting Distributed-Memory and Shared-Memory Parallelism on Clusters of SMPs with Data Parallel Programs , 2004, International Journal of Parallel Programming.
[29] John Mellor-Crummey,et al. Cross-architecture performance predictions for scientific applications using parameterized models , 2004 .
[30] Joseph JáJá,et al. Prefix computations on symmetric multiprocessors , 1999, Proceedings 13th International Parallel Processing Symposium and 10th Symposium on Parallel and Distributed Processing. IPPS/SPDP 1999.
[31] Nectarios Koziris,et al. Performance comparison of pure MPI vs hybrid MPI-OpenMP parallelization models on SMP clusters , 2004, 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings..
[32] Luc Giraud,et al. Combining Shared and Distributed Memory Programming Models on Clusters of Symmetric Multiprocessors: Some Basic Promising Experiments , 2002, Int. J. High Perform. Comput. Appl..
[33] Franck Cappello,et al. MPI versus MPI+OpenMP on the IBM SP for the NAS Benchmarks , 2000, ACM/IEEE SC 2000 Conference (SC'00).
[34] Vikram S. Adve,et al. Parallel program performance prediction using deterministic task graph analysis , 2004 .
[35] Roger W. Hockney,et al. The Communication Challenge for MPP: Intel Paragon and Meiko CS-2 , 1994, Parallel Computing.
[36] André Weinand. Eclipse - An Open Source Platform for the Next Generation of Development Tools , 2002, NetObjectDays.
[37] Rudolf Eigenmann,et al. Parallel programming with message passing and directives , 2001, Comput. Sci. Eng..
[38] Thomas Rauber,et al. A source code analyzer for performance prediction , 2004, 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings..