SCTP versus TCP for MPI

SCTP (Stream Control Transmission Protocol) is a recently standardized transport-level protocol with several features, not present in traditional TCP (Transmission Control Protocol), that better support the communication requirements of parallel applications. These features make SCTP a good candidate as a transport-level protocol for MPI (Message Passing Interface), a message-passing middleware that is widely used to parallelize scientific and compute-intensive applications. TCP is often used as the transport protocol for MPI in both local area and wide area networks. Prior to this work, SCTP had not been used for MPI. We compared and evaluated the benefits of using SCTP instead of TCP as the underlying transport protocol for MPI. We redesigned LAM-MPI, a public domain version of MPI, to use SCTP. We describe the advantages and disadvantages of using SCTP, the modifications to the MPI middleware needed to use SCTP, and the performance of SCTP compared to the stock implementation that uses TCP.
