Self-Consistent MPI Performance Guidelines
[1] Ralf H. Reussner, et al. SKaMPI: A Detailed, Accurate MPI Benchmark, 1998, PVM/MPI.
[2] Jack Dongarra, et al. MPI - The Complete Reference: Volume 1, The MPI Core, 1998.
[3] P. H. Worley. Comparison of Cray XT3 and XT4 Scalability, 2008.
[4] Leslie G. Valiant, et al. A bridging model for parallel computation, 1990, CACM.
[5] Boyana Norris, et al. Computational Quality of Service in Parallel CFD, 2006.
[6] Adolfy Hoisie, et al. A practical approach to performance analysis and modeling of large-scale systems, 2006, SC.
[7] Vivek Sarkar, et al. X10: concurrent programming for modern architectures, 2007, PPOPP.
[8] William Gropp, et al. MPI - The Complete Reference: Volume 2, The MPI Extensions, 1998.
[9] Seyed H. Roosta. Principles of Parallel Programming, 2000.
[10] Message Passing Interface Forum. MPI: A Message-Passing Interface Standard, 1994.
[11] Jack Dongarra, et al. MPI: The Complete Reference, 1996.
[12] Allen D. Malony, et al. Computational Quality of Service for Scientific CCA Applications: Composition, Substitution, and Reconfiguration, 2006.
[13] Jesús Labarta, et al. Generation of Simple Analytical Models for Message Passing Applications, 2004, Euro-Par.
[14] Joachim Worringen. Automated Performance Comparison, 2006, PVM/MPI.
[15] Sergei Gorlatch, et al. Toward Formally-Based Design of Message Passing Programs, 2000, IEEE Trans. Software Eng.
[16] Stephen Booth, et al. Exchanging multiple messages via MPI.
[17] Werner Augustin, et al. Usefulness and Usage of SKaMPI-Bench, 2003, PVM/MPI.
[18] Katherine Yelick, et al. UPC: Distributed Shared-Memory Programming, 2003.
[19] Joachim Worringen. Experiment Management and Analysis with perfbase, 2005, 2005 IEEE International Conference on Cluster Computing.
[20] Sergei Gorlatch, et al. Send-receive considered harmful: Myths and realities of message passing, 2004, TOPL.
[21] Ralf H. Reussner. Using SKaMPI for developing high-performance MPI programs with performance portability, 2003, Future Gener. Comput. Syst.
[22] Robert A. van de Geijn, et al. Building a high-performance collective communication library, 1994, Proceedings of Supercomputing '94.
[23] Jaswinder Pal Singh, et al. Application restructuring and performance portability on shared virtual memory and hardware-coherent multiprocessors, 1997, PPOPP '97.
[24] Message Passing Interface Forum. MPI: A message-passing interface standard, 1994.
[25] Kees Verstoep, et al. Fast Measurement of LogP Parameters for Message Passing Platforms, 2000, IPDPS Workshops.
[26] Jesús Labarta, et al. Validation of Dimemas Communication Model for MPI Collective Operations, 2000, PVM/MPI.
[27] Jesper Larsson Träff, et al. SKaMPI: a comprehensive benchmark for public benchmarking of MPI, 2002, Sci. Program.
[28] Patrick H. Worley, et al. Performance Portability in the Physical Parameterizations of the Community Atmospheric Model, 2005, Int. J. High Perform. Comput. Appl.
[29] Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.
[30] Hubert Ritzdorf, et al. Collective operations in NEC's high-performance MPI libraries, 2006, Proceedings 20th IEEE International Parallel & Distributed Processing Symposium.
[31] Torsten Hoefler, et al. Netgauge: A Network Performance Measurement Framework, 2007, HPCC.
[32] Robert A. van de Geijn, et al. On optimizing collective communication, 2004, 2004 IEEE International Conference on Cluster Computing (IEEE Cat. No.04EX935).
[33] Jesper Larsson Träff, et al. Self-consistent MPI Performance Requirements, 2007, PVM/MPI.
[34] Phillip Colella, et al. Parallel Languages and Compilers: Perspective From the Titanium Experience, 2007, Int. J. High Perform. Comput. Appl.
[35] Thomas Rauber, et al. Optimizing MPI collective communication by orthogonal structures, 2006, Cluster Computing.
[36] Jesper Larsson Träff. An Improved Algorithm for (Non-commutative) Reduce-Scatter with an Application, 2005, PVM/MPI.
[37] Isabelle Guérin Lassous, et al. PRO: A Model for the Design and Analysis of Efficient and Scalable Parallel Algorithms, 2006, Nord. J. Comput.
[38] Robert B. Ross, et al. Self-consistent MPI-IO Performance Requirements and Expectations, 2008, PVM/MPI.
[39] Ramesh Subramonian, et al. LogP: a practical model of parallel computation, 1996, CACM.
[40] Mark M. Mathis, et al. A performance model of non-deterministic particle transport on large-scale systems, 2003, Future Gener. Comput. Syst.
[41] Rajeev Thakur, et al. Optimization of Collective Communication Operations in MPICH, 2005, Int. J. High Perform. Comput. Appl.
[42] Rajeev Thakur, et al. Improving the Performance of Collective Operations in MPICH, 2003, PVM/MPI.
[43] M. Plummer, et al. An LPAR-customized MPI_AllToAllV for the Materials Science code CASTEP, 2004.