Elastic and scalable tracing and accurate replay of non-deterministic events
[1] J. Larus. Whole program paths , 1999, PLDI '99.
[2] Allen D. Malony,et al. The TAU Parallel Performance System , 2006, Int. J. High Perform. Comput. Appl..
[3] Wenguang Chen,et al. MPIWiz: subgroup reproducible replay of mpi applications , 2009, PPoPP '09.
[4] Jeffrey S. Vetter,et al. Statistical scalability analysis of communication operations in distributed applications , 2001, PPoPP '01.
[5] Qiang Xu,et al. "Logicalization" of MPI Communication Traces , 2008 .
[6] Frank Mueller,et al. ScalaBenchGen: Auto-Generation of Communication Benchmarks Traces , 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.
[7] Qiang Xu,et al. Construction and evaluation of coordinated performance skeletons , 2008, HiPC'08.
[8] Bernd Mohr,et al. The Scalasca performance toolset architecture , 2010, Concurr. Comput. Pract. Exp..
[9] Jeffrey K. Hollingsworth,et al. SIGMA: A Simulator Infrastructure to Guide Memory Analysis , 2002, ACM/IEEE SC 2002 Conference (SC'02).
[10] Scott Pakin,et al. Automatic Generation of Executable Communication Specifications from Parallel Applications , 2011, 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum.
[11] Sally A. McKee,et al. METRIC: tracking down inefficiencies in the memory hierarchy via binary rewriting , 2003, International Symposium on Code Generation and Optimization, 2003. CGO 2003..
[12] Adolfy Hoisie,et al. Performance and Scalability Analysis of Teraflop-Scale Parallel Architectures Using Multidimensional Wavefront Applications , 2000, Int. J. High Perform. Comput. Appl..
[13] Frank Mueller,et al. Memory Trace Compression and Replay for SPMD Systems using Extended PRSDs , 2011, PERV.
[14] Martin Schulz,et al. Preserving time in large-scale communication traces , 2008, ICS '08.
[15] E. N. Elnozahy. Address trace compression through loop detection and reduction , 1999, SIGMETRICS '99.
[16] Sriram Krishnamoorthy,et al. Scalable Communication Trace Compression , 2010, 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing.
[17] Robert J. Fowler,et al. Scalable methods for monitoring and detecting behavioral equivalence classes in scientific codes , 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[18] Dieter Kranzlmüller,et al. ROLT^MP: replay of Lamport timestamps for message passing systems , 1998, Proceedings of the Sixth Euromicro Workshop on Parallel and Distributed Processing - PDP '98 -.
[19] Wolfgang E. Nagel,et al. Construction and compression of complete call graphs for post-mortem program trace analysis , 2005, 2005 International Conference on Parallel Processing (ICPP'05).
[20] Ken Kennedy,et al. An Implementation of Interprocedural Bounded Regular Section Analysis , 1991, IEEE Trans. Parallel Distributed Syst..
[21] Xing Wu,et al. Probabilistic Communication and I/O Tracing with Deterministic Replay at Scale , 2011, 2011 International Conference on Parallel Processing.
[22] Nathan Froyd,et al. Low-overhead call path profiling of unmodified, optimized code , 2005, ICS '05.
[23] Martin Schulz,et al. Scalable load-balance measurement for SPMD codes , 2008, HiPC 2008.
[24] Bronis R. de Supinski,et al. A hybrid hardware/software approach to efficiently determine cache coherence Bottlenecks , 2005, ICS '05.
[25] Toni Cortes,et al. PARAVER: A Tool to Visualize and Analyze Parallel Code , 2007 .
[26] Nathan R. Tallent,et al. HPCTOOLKIT: tools for performance analysis of optimized parallel programs http://hpctoolkit.org , 2010 .
[27] Martin Burtscher,et al. VPC3: a fast and effective trace-compression algorithm , 2004, SIGMETRICS '04/Performance '04.
[28] Martin Schulz,et al. Clustering performance data efficiently at massive scales , 2010, ICS '10.
[29] Wolfgang E. Nagel,et al. Introducing the Open Trace Format (OTF) , 2006, International Conference on Computational Science.
[30] Ian H. Witten,et al. Linear-time, incremental hierarchy inference for compression , 1997, Proceedings DCC '97. Data Compression Conference.
[31] Sally A. McKee,et al. METRIC: Memory tracing via dynamic binary rewriting to identify cache inefficiencies , 2007, TOPLAS.
[32] Craig G. Nevill-Manning,et al. Compression and Explanation Using Hierarchical Grammars , 1997, Comput. J..
[33] Wenguang Chen,et al. PHANTOM: predicting performance of parallel applications on large-scale parallel machines using a single node , 2010, PPoPP '10.
[34] Frank Mueller,et al. ScalaExtrap: trace-based communication extrapolation for spmd programs , 2011, PPoPP '11.
[35] Martin Schulz,et al. ScalaTrace: Scalable compression and replay of communication traces for high-performance computing , 2008, J. Parallel Distributed Comput..
[36] Wenguang Chen,et al. FACT: fast communication trace collection for parallel applications through program slicing , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[37] James W. Hurrell. The Community Earth System Model , 2013 .
[38] VAMPIR: Visualization and Analysis of MPI Resources (internal report) , 1996 .
[39] David H. Bailey,et al. The NAS Parallel Benchmarks , 1991, Int. J. High Perform. Comput. Appl..