Fault-tolerant parallel applications using a network of workstations
[1] Santosh K. Shrivastava,et al. Rajdoot: A Remote Procedure Call Mechanism Supporting Orphan Detection and Killing , 1988, IEEE Trans. Software Eng..
[2] Ian T. Foster,et al. Productive Parallel Programming: The PCN Approach , 1995, Sci. Program..
[3] Robert D. Silverman. Massively distributed computing and factoring large integers , 1991, CACM.
[4] Robert M. Hyatt,et al. Construction of a fault-tolerant distributed tuple-space , 1993, SAC '93.
[5] Jeffrey S. Chase,et al. The Amber system: parallel programming on a network of multiprocessors , 1989, SOSP '89.
[6] Karpjoo Jeong,et al. Fault-tolerant Parallel Processing Combining Linda, Checkpointing, and Transactions , 1996 .
[7] Leslie Lamport,et al. Distributed snapshots: determining global states of distributed systems , 1985, TOCS.
[8] Paul Hudak,et al. Memory coherence in shared virtual memory systems , 1989, TOCS.
[9] Barbara Liskov,et al. A design for a fault-tolerant, distributed implementation of Linda , 1989, [1989] The Nineteenth International Symposium on Fault-Tolerant Computing. Digest of Papers.
[10] Ian Foster,et al. Designing and building parallel programs , 1994 .
[11] Horst Langendörfer,et al. Load balancing and fault tolerance in workstation clusters migrating groups of communicating processes , 1995, OPSR.
[12] Andrew Birrell,et al. Implementing remote procedure calls , 1984, TOCS.
[13] Lily B. Mummert,et al. Camelot and Avalon: A Distributed Transaction Facility , 1991 .
[14] Seth Copen Goldstein,et al. Active messages: a mechanism for integrating communication and computation , 1998, ISCA '98.
[15] Andrew H. Sherman,et al. Ray Tracing with Network Linda , 1996, Applications on Advanced Architecture Computers.
[16] Miron Livny,et al. The Available Capacity of a Privately Owned Workstation Environment , 1991, Perform. Evaluation.
[17] Partha Dasgupta,et al. Parallel processing on networks of workstations: a fault-tolerant, high performance approach , 1995, Proceedings of 15th International Conference on Distributed Computing Systems.
[18] Joel H. Saltz,et al. Data parallel programming in an adaptive environment , 1995, Proceedings of 9th International Parallel Processing Symposium.
[19] Clemens H. Cap,et al. Massive Parallelism with Workstation Clusters: Challenge or Nonsense? , 1994, HPCN.
[20] David L. Presotto,et al. Publishing: a reliable broadcast communication mechanism , 1983, SOSP '83.
[21] Jack J. Dongarra,et al. Key Concepts for Parallel Out-of-Core LU Factorization , 1996, Parallel Comput..
[22] Michael J. Flynn,et al. Very high-speed computing systems , 1966 .
[23] William Jalby,et al. Impact of Hierarchical Memory Systems On Linear Algebra Algorithm Design , 1988 .
[24] Jonathan M. Smith,et al. A taxonomy-based comparison of several distributed shared memory systems , 1990, OPSR.
[25] L. Peterson,et al. Cluster-C*: Understanding the Performance Limits , 1994 .
[26] Rajeev Thakur,et al. An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays , 1996, Sci. Program..
[27] E. B. Moss,et al. Nested Transactions: An Approach to Reliable Distributed Computing , 1985 .
[28] Monica S. Lam,et al. Transparent Fault Tolerance for Parallel Applications on Networks of Workstations , 1996, USENIX Annual Technical Conference.
[29] Jack Dongarra,et al. An Introduction to the MPI Standard , 1995 .
[30] Kenneth P. Birman,et al. Using the ISIS resource manager for distributed, fault-tolerant computing , 1993, [1993] Proceedings of the Twenty-sixth Hawaii International Conference on System Sciences.
[31] Fred Douglis,et al. Transparent process migration: Design alternatives and the sprite implementation , 1991, Softw. Pract. Exp..
[32] Alan L. Cox,et al. TreadMarks: shared memory computing on networks of workstations , 1996 .
[33] David P. Anderson,et al. Marionette: a system for parallel distributed programming using a master/slave model , 1989, [1989] Proceedings. The 9th International Conference on Distributed Computing Systems.
[34] Alan Edelman,et al. Large numerical linear algebra in 1994: the continuing influence of parallel computing , 1994, Proceedings of IEEE Scalable High Performance Computing Conference.
[35] Kai Li,et al. Heterogeneous Distributed Shared Memory , 1992, IEEE Trans. Parallel Distributed Syst..
[36] Brian N. Bershad,et al. The Midway distributed shared memory system , 1993, Digest of Papers. Compcon Spring.
[37] Brian Randell,et al. System structure for software fault tolerance , 1975, IEEE Transactions on Software Engineering.
[38] Garth A. Gibson,et al. RAID: high-performance, reliable secondary storage , 1994, CSUR.
[39] Willy Zwaenepoel,et al. Manetho: Transparent Rollback-Recovery with Low Overhead, Limited Rollback, and Fast Output Commit , 1992, IEEE Trans. Computers.
[40] Alok Choudhary,et al. VIP-FS: a VIrtual, Parallel File System for high performance parallel and distributed computing , 1995, Proceedings of 9th International Parallel Processing Symposium.
[41] Scott R. Cannon,et al. Adding fault-tolerant transaction processing to LINDA , 1994, Softw. Pract. Exp..
[42] Georg Stellner,et al. CoCheck: checkpointing and process migration for MPI , 1996, Proceedings of International Conference on Parallel Processing.
[43] Kai Hwang,et al. Advanced computer architecture - parallelism, scalability, programmability , 1992 .
[44] Miron Livny,et al. Managing Checkpoints for Parallel Programs , 1996, JSSPP.
[45] Maurice J. Bach. The Design of the UNIX Operating System , 1986 .
[46] Nicholas Carriero,et al. How to write parallel programs - a first course , 1990 .
[47] Allan Gottlieb,et al. Highly parallel computing , 1989, Benjamin/Cummings Series in computer science and engineering.
[48] Lorenzo Alvisi,et al. Paralex: an environment for parallel programming in distributed systems , 1991, ICS '92.
[49] Ewing L. Lusk,et al. Monitors, Messages, and Clusters: The p4 Parallel Programming System , 1994, Parallel Comput..
[50] Erik Seligman,et al. Dome: Parallel Programming in a Heterogeneous Multi-User Environment , 1995 .
[51] Dror G. Feitelson,et al. Parallel I/O Systems and Interfaces for Parallel Computers , 1995 .
[52] Robert D. Blumofe,et al. Adaptive and Reliable Parallel Computing on Networks of Workstations , 1997 .
[53] David Kaminsky. Adaptive parallelism with Piranha , 1995 .
[54] W. Richard Stevens,et al. Unix network programming , 1990, CCRV.
[55] W. Kent Fuchs,et al. Reduced overhead logging for rollback recovery in distributed shared memory , 1995, Twenty-Fifth International Symposium on Fault-Tolerant Computing. Digest of Papers.
[56] Miron Livny,et al. The DBC: processing scientific data over the Internet , 1996, Proceedings of 16th International Conference on Distributed Computing Systems.
[57] Santosh K. Shrivastava,et al. A System for Fault-Tolerant Execution of Data and Compute Intensive Programs over a Network of Workstations , 1996, Euro-Par, Vol. I.
[58] Henri E. Bal,et al. Transparent fault-tolerance in parallel Orca programs , 1992 .
[59] George Karypis,et al. Introduction to Parallel Computing , 1994 .
[60] Timothy G. Mattson. Parallel Programming Systems for Workstation Clusters , 1993 .
[61] Andreas Reuter,et al. Transaction Processing: Concepts and Techniques , 1992 .
[62] Werner Almesberger. ATM on Linux , 1996 .
[63] Jonathan Walpole,et al. MPVM: A Migration Transparent Version of PVM , 1995, Comput. Syst..
[64] Willy Zwaenepoel,et al. Implementation and performance of Munin , 1991, SOSP '91.
[65] Ian T. Foster,et al. Overview of the I-Way: Wide-Area Visual Supercomputing , 1996, Int. J. High Perform. Comput. Appl..
[66] Richard D. Schlichting,et al. Supporting Fault-Tolerant Parallel Programming in Linda , 1995, IEEE Trans. Parallel Distributed Syst..
[67] Robert Prouty,et al. Adaptive Execution of Data Parallel Computations on Networks of Heterogeneous Workstations , 1994 .
[68] John H. Hartman,et al. The Zebra striped network file system , 1995, TOCS.
[69] Barbara Liskov,et al. Implementation of resilient, atomic data types , 1985 .
[70] Michael J. Quinn,et al. Data-parallel programming on a network of heterogeneous workstations , 1992, Proceedings of the First International Symposium on High-Performance Distributed Computing. (HPDC-1).
[71] Leslie Lamport,et al. How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs , 1979, IEEE Transactions on Computers.
[72] David S. Greenberg,et al. Beyond core: Making parallel computer I/O practical , 1993 .
[73] Gilbert Cabillic,et al. The performance of consistent checkpointing in distributed shared memory systems , 1995, Proceedings. 14th Symposium on Reliable Distributed Systems.
[74] Andrea C. Arpaci-Dusseau,et al. Parallel programming in Split-C , 1993, Supercomputing '93. Proceedings.
[75] Jim Smith,et al. On Synchronisation in Fault-Tolerant Data and Compute Intensive Programs over a Network of Workstations , 1997, Euro-Par.
[76] David E. Culler,et al. A case for NOW (networks of workstations) , 1995, PODC '95.
[77] J. Maier,et al. Fault-tolerant parallel programming with atomic actions , 1994, Proceedings of IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems.
[78] Alok N. Choudhary,et al. Improved parallel I/O via a two-phase run-time access strategy , 1993, CARN.
[79] Jack J. Dongarra,et al. The PVM Concurrent Computing System: Evolution, Experiences, and Trends , 1994, Parallel Comput..
[80] Barbara Liskov,et al. Distributed programming in Argus , 1988, CACM.
[81] Thorsten von Eicken,et al. U-Net: a user-level network interface for parallel and distributed computing , 1995, SOSP.
[82] J. Dongarra. Performance of various computers using standard linear equations software , 1990, CARN.
[83] Robbert van Renesse,et al. Experiences with the Amoeba distributed operating system , 1990, CACM.
[84] Henri E. Bal. Fault-tolerant parallel programming in Argus , 1992, Concurr. Pract. Exp..
[85] Jack J. Dongarra,et al. Algorithm-based diskless checkpointing for fault tolerant matrix operations , 1995, Twenty-Fifth International Symposium on Fault-Tolerant Computing. Digest of Papers.
[86] Jack Dongarra,et al. HeNCE: graphical development tools for network-based concurrent computing , 1992, Proceedings Scalable High Performance Computing Conference SHPCC-92..
[87] Dror G. Feitelson,et al. Job Scheduling in Multiprogrammed Parallel Systems , 1997 .
[88] Thomas Ludwig,et al. PFSLib - An I/O Interface for Parallel Programming Environments on Coupled Workstations , 1995 .
[89] P. Dasgupta,et al. Implementing consistency control mechanisms in the Clouds distributed operating system , 1991, [1991] Proceedings. 11th International Conference on Distributed Computing Systems.
[90] Santosh K. Shrivastava,et al. Performance of Fault-Tolerant Data and Compute Intensive Programs over a Network of Workstations , 1998, Theor. Comput. Sci..
[91] Philip A. Bernstein,et al. Implementing recoverable requests using queues , 1990, SIGMOD '90.
[92] Partha Dasgupta,et al. CALYPSO: a novel software system for fault-tolerant parallel processing on distributed platforms , 1995, Proceedings of the Fourth IEEE International Symposium on High Performance Distributed Computing.
[93] Henri E. Bal,et al. Object distribution in Orca using Compile-Time and Run-Time techniques , 1993, OOPSLA '93.
[94] Ravi Mirchandaney,et al. Experiences with networked parallel computing , 1995, Concurr. Pract. Exp..
[95] Stuart M. Wheater. Constructing reliable distributed applications using actions and objects , 1989 .
[96] Steven A. Moyer,et al. PIOUS: a scalable parallel I/O system for distributed computing environments , 1994, Proceedings of IEEE Scalable High Performance Computing Conference.