An MPI Tool to Measure Application Sensitivity to Variation in Communication Parameters
[1] Sanguthevar Rajasekaran. Randomized Selection on the Hypercube , 1996, J. Parallel Distributed Comput..
[2] Ramesh Subramonian,et al. LogP: towards a realistic model of parallel computation , 1993, PPOPP '93.
[3] David A. Bader,et al. Practical parallel algorithms for dynamic data redistribution, median finding, and selection , 1995, Proceedings of International Conference on Parallel Processing.
[4] Chris J. Scheiman,et al. LogGP: incorporating long messages into the LogP model—one step closer towards a realistic model for parallel computation , 1995, SPAA '95.
[5] John N. Shadid,et al. Official Aztec user's guide: version 2.1 , 1999 .
[6] Charles L. Seitz,et al. Myrinet: A Gigabit-per-Second Local Area Network , 1995, IEEE Micro.
[7] Steven J. Plimpton,et al. Parallel Molecular Dynamics With the Embedded Atom Method , 1992 .
[8] Henri E. Bal,et al. Bandwidth-efficient collective communication for clustered wide area systems , 2000, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000.
[9] Mario Lauria,et al. Cross-Platform Analysis of Fast Messages for Myrinet , 1998, CANPC.
[10] D.E. Culler,et al. Effects Of Communication Latency, Overhead, And Bandwidth In A Cluster Architecture , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[11] Adolfy Hoisie,et al. Exploring advanced architectures using performance prediction , 2002, International Workshop on Innovative Architecture for Future Generation High-Performance Processors and Systems.
[12] David A. Bader,et al. Practical parallel algorithms for personalized communication and integer sorting , 1996, JEAL.
[13] Maurice Yarrow,et al. New Implementations and Results for the NAS Parallel Benchmarks 2 , 1997, PPSC.
[14] Steven G. Johnson,et al. FFTW: an adaptive software architecture for the FFT , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[15] Anthony Skjellum,et al. A High-Performance, Portable Implementation of the MPI Message Passing Interface Standard , 1996, Parallel Comput..
[16] Kees Verstoep,et al. Fast Measurement of LogP Parameters for Message Passing Platforms , 2000, IPDPS Workshops.
[17] John L. Hennessy,et al. The Effects of Latency, Occupancy, and Bandwidth in Distributed Shared Memory Multiprocessors , 1995 .
[18] Robert D. Falgout,et al. Semicoarsening Multigrid on Distributed Memory Machines , 1999, SIAM J. Sci. Comput..
[19] John N. Shadid,et al. Parallel performance of a preconditioned CG solver for unstructured finite element applications , 1994 .
[20] Steve Plimpton,et al. Fast parallel algorithms for short-range molecular dynamics , 1993 .
[21] John N. Shadid,et al. Parallel sparse matrix vector multiply software for matrices with data locality , 1998, Concurr. Pract. Exp..
[22] Rod A. Fatoohi,et al. Performance evaluation of three distributed computing environments for scientific applications , 1994, Proceedings of Supercomputing '94.
[23] Steve Schaffer,et al. A Semicoarsening Multigrid Method for Elliptic Partial Differential Equations with Highly Discontinuous and Anisotropic Coefficients , 1998, SIAM J. Sci. Comput..
[24] Foiles,et al. Embedded-atom-method functions for the fcc metals Cu, Ag, Au, Ni, Pd, Pt, and their alloys. , 1986, Physical review. B, Condensed matter.
[25] Jaswinder Pal Singh,et al. The effects of communication parameters on end performance of shared virtual memory clusters , 1997, SC '97.
[26] Ron Brightwell,et al. Instrumenting LogP parameters in GM: implementation and validation , 2002, 27th Annual IEEE Conference on Local Computer Networks, 2002. Proceedings. LCN 2002..
[27] David A. Bader. An Improved Randomized Selection Algorithm With an Experimental Study (Extended Abstract) , 1999 .
[28] M. Baskes,et al. Embedded-atom method: Derivation and application to impurities, surfaces, and other defects in metals , 1984 .
[29] Sanguthevar Rajasekaran,et al. Derivation of Randomized Sorting and Selection Algorithms , 1993 .
[30] Richard P. Martin,et al. Assessing Fast Network Interfaces , 1996, IEEE Micro.
[31] P. R. Cappello,et al. Implementing the beam and warming method on the hypercube , 1989, C3P.
[32] John N. Shadid,et al. Parallel sparse matrix vector multiply software for matrices with data locality , 1998 .
[33] Fabrizio Petrini,et al. Predictive Performance and Scalability Modeling of a Large-Scale Application , 2001, ACM/IEEE SC 2001 Conference (SC'01).
[34] Anthony Skjellum,et al. A high-performance, portable implementation of the MPI message passing interface standard , 1996 .