Detecting Load Imbalance in Massively Parallel Applications Internship Report
[1] Robert J. Fowler, et al. HPCVIEW: A Tool for Top-down Analysis of Node Performance, 2002, The Journal of Supercomputing.
[2] Ken Kennedy, et al. Automatic tuning of whole applications using direct search and a performance-based transformation system, 2006, The Journal of Supercomputing.
[3] William Cyrus Navidi, et al. Statistics for Engineers and Scientists, 2004.
[4] Wagner Meira, et al. Waiting time analysis and performance visualization in Carnival, 1996, SPDT '96.
[5] Jack Dongarra, et al. Automating the Large-Scale Collection and Analysis of Performance, 2004.
[6] Ying Zhang, et al. SvPablo: A Multi-language Performance Analysis System, 1998, Computer Performance Evaluation.
[7] Anthony P. Reeves, et al. Strategies for Dynamic Load Balancing on Highly Parallel Computers, 1993, IEEE Trans. Parallel Distributed Syst.
[8] Rick Kufrin, et al. PerfSuite: An Accessible, Open Source Performance Analysis Environment for Linux, 2005.
[9] Martin Schulz, et al. Scalable load-balance measurement for SPMD codes, 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.
[10] Thomas J. LeBlanc, et al. Parallel performance prediction using lost cycles analysis, 1994, Proceedings of Supercomputing '94.
[11] J. Mark Bull, et al. A hierarchical classification of overheads in parallel programs, 1996, Software Engineering for Parallel and Distributed Systems.
[12] Luiz De Rose, et al. Detecting Application Load Imbalance on High End Massively Parallel Systems, 2007, Euro-Par.
[13] Marc-André Hermanns, et al. Verifying Causality between Distant Performance Phenomena in Large-Scale MPI Applications, 2009, 2009 17th Euromicro International Conference on Parallel, Distributed and Network-based Processing.
[14] C. Glymour, et al. Statistics and Causal Inference, 1985.
[15] Dror G. Feitelson, et al. Flexible coscheduling: mitigating load imbalance and improving utilization of heterogeneous resources, 2003, Proceedings International Parallel and Distributed Processing Symposium.
[16] Craig B. Zilles, et al. A criticality analysis of clustering in superscalar processors, 2005, 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05).
[17] Martin Schulz, et al. An Open Infrastructure for Scalable, Reconfigurable Analysis, 2008.
[18] James E. Smith, et al. Complexity-Effective Superscalar Processors, 1997, ISCA.
[19] Markus Geimer, et al. Scalable Performance Analysis Methods for the Next Generation of Supercomputers, 2008.
[20] Rizos Sakellariou, et al. Compile-time minimisation of load imbalance in loop nests, 1997, ICS '97.
[21] Scott Pakin, et al. Identifying and Eliminating the Performance Variability on the ASCI Q Machine, 2003.
[22] Bernd Mohr, et al. Scalable Parallel Trace-Based Performance Analysis, 2006, PVM/MPI.
[23] Marcelo H. Cintra, et al. A compiler cost model for speculative parallelization, 2007, TACO.
[24] Marc-André Hermanns, et al. Verifying Causal Connections between Distant Performance Phenomena in Large-Scale Message-Passing Applications, 2008.
[25] Allen D. Malony, et al. Design and implementation of a parallel performance data management framework, 2005, 2005 International Conference on Parallel Processing (ICPP'05).
[26] Franco Zambonelli, et al. Diffusive load-balancing policies for dynamic applications, 1999, IEEE Concurr.
[27] Allen D. Malony, et al. ParaProf: A Portable, Extensible, and Scalable Tool for Parallel Performance Profile Analysis, 2003, Euro-Par.
[28] Emery D. Berger, et al. A locality-improving dynamic memory allocator, 2005, MSP '05.