Error detection in large-scale parallel programs with long runtimes
[1] Bernard Tourancheau,et al. The Design of the General Parallel Monitoring System , 1992, Programming Environments for Parallel Computing.
[2] Message Passing Interface Forum. MPI: A Message-Passing Interface Standard , 1994 .
[3] Stuart I. Feldman,et al. IGOR: a system for program debugging via reversible execution , 1988, PADD '88.
[4] James S. Plank. An Overview of Checkpointing in Uniprocessor and Distributed Systems, Focusing on Implementation and , 1997 .
[5] James S. Plank,et al. An Overview of Checkpointing in Uniprocessor and Distributed Systems, Focusing on Implementation and Performance , 1997 .
[6] L. Alvisi,et al. A Survey of Rollback-Recovery Protocols , 2002 .
[7] Achour Mostéfaoui,et al. Communication-Induced Determination of Consistent Snapshots , 1999, IEEE Trans. Parallel Distributed Syst..
[8] Jack C. Wileden,et al. High-level debugging of distributed systems: The behavioral abstraction approach , 1983, J. Syst. Softw..
[9] Robert Hood. The p2d2 project: building a portable distributed debugger , 1996, SPDT '96.
[10] José C. Cunha,et al. An experiment in tool integration: The DDBG parallel and distributed debugger , 1999, J. Syst. Archit..
[11] Jack Dongarra,et al. PVM 3 User's Guide and Reference Manual , 1993 .
[12] Mukesh Singhal,et al. Mutable Checkpoints: A New Checkpointing Approach for Mobile Computing Systems , 2001, IEEE Trans. Parallel Distributed Syst..
[13] David W. Binkley,et al. Program slicing , 2008, 2008 Frontiers of Software Maintenance.
[14] Dieter Kranzlmüller. Incremental Tracing and Process Isolation for Debugging Parallel Programs , 2000, Comput. Artif. Intell..
[15] Robert H. B. Netzer,et al. Optimal tracing and incremental reexecution for debugging long-running programs , 1994, PLDI '94.
[16] Dieter Kranzlmüller,et al. An Integrated Record&Replay Mechanism for Nondeterministic Message Passing Programs , 2001, PVM/MPI.
[17] Leslie Lamport,et al. Time, clocks, and the ordering of events in a distributed system , 1978, CACM.
[18] Friedel Hossfeld,et al. Teraflops Computing: A Challenge to Parallel Numerics? , 1999, ACPC.
[19] Jong-Deok Choi,et al. Techniques for debugging parallel programs with flowback analysis , 1991, TOPLAS.
[20] Martin Stitt. Debugging: Creative Techniques and Tools for Software Repair , 1992 .
[21] Dieter Kranzlmüller,et al. Debugging OpenMP Programs Using Event Manipulation , 2001, WOMPAT.
[22] Dieter Kranzlmüller,et al. Event Graph Analysis for Debugging Massively Parallel Programs , 2000 .
[23] Francine Berman,et al. Panorama: a portable, extensible parallel debugger , 1993, PADD '93.
[24] Franco Zambonelli,et al. An efficient logging algorithm for incremental replay of message-passing applications , 1999, Proceedings 13th International Parallel Processing Symposium and 10th Symposium on Parallel and Distributed Processing. IPPS/SPDP 1999.
[25] Leslie Lamport,et al. Distributed snapshots: determining global states of distributed systems , 1985, TOCS.
[26] Robert Balzer,et al. EXDAMS: extendable debugging and monitoring system , 1969, AFIPS '69 (Spring).
[27] Jason Gait,et al. A probe effect in concurrent programs , 1986, Softw. Pract. Exp..
[28] Henryk Krawczyk,et al. Analysis and Testing of Distributed Software Applications , 1998 .
[29] Andreas Zeller. Visual debugging with DDD , 2001 .
[30] Eugene H. Spafford,et al. An execution-backtracking approach to debugging , 1991, IEEE Software.