Unresponsiveness-Tolerant Collective Communication

[1]  C. C. Huang,et al.  Multicast virtual topologies for collective communication in MPCs and ATM clusters , 1995 .

[2]  Laxmikant V. Kalé,et al.  Converse: an interoperable framework for parallel programming , 1996, Proceedings of International Conference on Parallel Processing.

[3]  Lixia Zhang,et al.  A reliable multicast framework for light-weight sessions and application level framing , 1995 .

[4]  Stephen E. Deering,et al.  Host extensions for IP multicasting , 1986, RFC.

[5]  Leslie Lamport,et al.  Time, clocks, and the ordering of events in a distributed system , 1978, CACM.

[6]  Todd Montgomery,et al.  A Loss Tolerant Rate Controller for Reliable Multicast , 1997 .

[7]  Jehoshua Bruck,et al.  CCL: A Portable and Tunable Collective Communication Library for Scalable Parallel Computers , 1995, IEEE Trans. Parallel Distributed Syst..

[8]  Andrea C. Arpaci-Dusseau,et al.  Effective distributed scheduling of parallel workloads , 1996, SIGMETRICS '96.

[9]  André Schiper,et al.  Lightweight causal and atomic group multicast , 1991, TOCS.

[10]  Henri E. Bal,et al.  Efficient multicast on Myrinet using link-level flow control , 1998, Proceedings. 1998 International Conference on Parallel Processing.

[11]  Craig Partridge,et al.  A faster UDP , 1993, TNET.

[12]  W. Daniel Hillis,et al.  Data parallel algorithms , 1986, CACM.

[13]  Massimo Bernaschi,et al.  Collective communication operations: experimental results vs. theory , 1998 .

[14]  Leslie Lamport,et al.  How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs , 1979, IEEE Transactions on Computers.

[15]  Robert D. Blumofe,et al.  Adaptive and Reliable Parallel Computing on Networks of Workstations , 1997 .

[16]  Ramesh Subramonian,et al.  LogP: towards a realistic model of parallel computation , 1993, PPOPP '93.

[17]  Tom Shanley,et al.  PCI System Architecture , 1993 .

[18]  Scott Pakin,et al.  Dynamic Coscheduling on Workstation Clusters , 1998, JSSPP.

[19]  Jon Postel,et al.  User Datagram Protocol , 1980, RFC.

[20]  Robert J. Harrison,et al.  Global Arrays: a portable "shared-memory" programming model for distributed memory computers , 1994, Proceedings of Supercomputing '94.

[21]  Francis J. Aguilar.  Cray Research, Inc. , 2002 .

[22]  Bernard Tourancheau,et al.  BIP: A New Protocol Designed for High Performance Networking on Myrinet , 1998, IPPS/SPDP Workshops.

[23]  Thorsten von Eicken,et al.  Incorporating Memory Management into User-Level Network Interfaces , 1997 .

[24]  Steven L. Scott,et al.  Synchronization and communication in the T3E multiprocessor , 1996, ASPLOS VII.

[25]  Larry L. Peterson,et al.  Fbufs: a high-bandwidth cross-domain transfer facility , 1994, SOSP '93.

[26]  H.H.J. Hum,et al.  Polling Watchdog: Combining Polling and Interrupts for Efficient Message Handling , 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).

[27]  Robert A. van de Geijn,et al.  Fast Collective Communication Libraries, Please , 1995 .

[28]  Mitsuhisa Sato,et al.  PM: An Operating System Coordinated High Performance Communication Library , 1997, HPCN Europe.

[29]  Helen Custer,et al.  Inside Windows NT , 1992 .

[30]  Kenneth P. Birman,et al.  Design Alternatives for Process Group Membership and Multicast , 1991 .

[31]  Rajiv Gupta.  The fuzzy barrier: a mechanism for high speed synchronization of processors , 1989, ASPLOS 1989.

[32]  D. B. Davis,et al.  Intel Corp. , 1993 .

[33]  Andrea C. Arpaci-Dusseau,et al.  Scheduling with implicit information in distributed systems , 1998, SIGMETRICS '98/PERFORMANCE '98.

[34]  Jo-Mei Chang,et al.  Reliable broadcast protocols , 1984, TOCS.

[35]  Anoop Gupta,et al.  Memory consistency and event ordering in scalable shared-memory multiprocessors , 1990, ISCA '90.

[36]  D. Lenoski,et al.  The SGI Origin: A ccNUMA Highly Scalable Server , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.

[37]  Chung Laung Liu,et al.  Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment , 1973, JACM.

[38]  Todd Montgomery,et al.  A High Performance Totally Ordered Multicast Protocol , 1994, Dagstuhl Seminar on Distributed Systems.

[39]  C.R. Johnson,et al.  SCIRun: A Scientific Programming Environment for Computational Steering , 1995, Proceedings of the IEEE/ACM SC95 Conference.

[40]  David M. Nicol,et al.  Noncommittal Barrier Synchronization , 1995, Parallel Comput..

[41]  Alan O. Freier,et al.  Multicast Transport Protocol , 1992, RFC.

[42]  John K. Ousterhout.  Scheduling Techniques for Concurrent Systems , 1982, ICDCS 1982.

[43]  Yuri Petrovich Ofman,et al.  On the Algorithmic Complexity of Discrete Functions , 1962 .

[44]  Samuel J. Leffler,et al.  A 4.2BSD Interprocess Communication Primer , 1983 .

[45]  Butler W. Lampson,et al.  Reliable messages and connection establishment , 1993 .

[46]  W. Daniel Hillis,et al.  The Network Architecture of the Connection Machine CM-5 , 1996, J. Parallel Distributed Comput..

[47]  Al Geist,et al.  Network-based concurrent computing on the PVM system , 1992, Concurr. Pract. Exp..

[48]  Henri E. Bal,et al.  MagPIe: MPI's collective communication operations for clustered wide area systems , 1999, PPoPP '99.

[49]  David H. Bailey,et al.  The NAS Parallel Benchmarks , 1991, Int. J. High Perform. Comput. Appl..

[50]  Scott Pakin,et al.  Design and Evaluation of an HPVM-Based Windows NT Supercomputer , 1999, Int. J. High Perform. Comput. Appl..

[51]  Henry G. Dietz,et al.  A Parallel Processing Support Library Based on Synchronized Aggregate Communication , 1995, LCPC.

[52]  Tilak Agerwala,et al.  SP2 System Architecture , 1999, IBM Syst. J..

[53]  W. Daniel Hillis,et al.  The connection machine , 1985 .

[54]  Evgenia Smirni,et al.  The next frontier: interactive and closed loop performance steering , 1996, 1996 Proceedings ICPP Workshop on Challenges for Parallel Processing.

[55]  George B. Adams,et al.  SLICC: a low latency interface for collective communications , 1994, Proceedings of Supercomputing '94.

[56]  Brian N. Bershad,et al.  Scheduler activations: effective kernel support for the user-level management of parallelism , 1991, TOCS.

[57]  Patrick Sobalvarro,et al.  Demand-Based Coscheduling of Parallel Jobs on Multiprogrammed Multiprocessors , 1995, JSSPP.

[58]  David Clark,et al.  An analysis of TCP processing overhead , 1989 .

[59]  Larry Peterson,et al.  TCP Vegas: new techniques for congestion detection and avoidance , 1994, SIGCOMM 1994.

[60]  Ozalp Babaoglu,et al.  Consistent global states of distributed systems: fundamental concepts and mechanisms , 1993 .

[61]  Katherine Guo,et al.  Scalability of the Microsoft Cluster Service , 1998 .

[62]  Sanjoy Paul,et al.  RMTP: a reliable multicast transport protocol , 1996, Proceedings of IEEE INFOCOM '96. Conference on Computer Communications.

[63]  David Mosberger,et al.  Memory consistency models , 1993, OPSR.

[64]  Leslie G. Valiant,et al.  A bridging model for parallel computation , 1990, CACM.

[65]  Jon Beecroft,et al.  Meiko CS-2 Interconnect Elan-Elite Design , 1994, Parallel Comput..

[66]  Cezary Dubnicki,et al.  VMMC-2: Efficient Support for Reliable, Connection-Oriented Communication , 1997 .

[67]  H. F. Jordan.  A Special Purpose Architecture for Finite Element Analysis , 1978 .

[68]  Michael J. Flynn,et al.  Very high-speed computing systems , 1966 .

[69]  Amotz Bar-Noy,et al.  Designing broadcasting algorithms in the postal model for message-passing systems , 1992, SPAA '92.

[70]  William E. Weihl,et al.  Reducing synchronization overhead in parallel simulation , 1996, Workshop on Parallel and Distributed Simulation.

[71]  Robert Metcalfe,et al.  Ethernet: distributed packet switching for local computer networks , 1976, CACM.

[72]  Scott Pakin,et al.  Fast messages: efficient, portable communication for workstation clusters and MPPs , 1997, IEEE Concurrency.

[73]  Alex Koifman,et al.  RAMP: a reliable adaptive multicast protocol , 1996, Proceedings of IEEE INFOCOM '96. Conference on Computer Communications.

[74]  Marco Fillo,et al.  Architecture and implementation of MEMORY CHANNEL 2 , 1997 .

[75]  Kees Verstoep,et al.  Efficient reliable multicast on Myrinet , 1996, Proceedings of the 1996 ICPP Workshop on Challenges for Parallel Processing.

[76]  Eric A. Brewer,et al.  How to get good performance from the CM-5 data network , 1994, Proceedings of 8th International Parallel Processing Symposium.

[77]  Edward W. Felten,et al.  Reducing waiting costs in user-level communication , 1997, Proceedings 11th International Parallel Processing Symposium.