The Effect of Network Noise on Large-Scale Collective Communications