Characterizing throughput bottlenecks for secure GridFTP transfers

GridFTP is the de facto standard for bulk data movement in distributed science environments. It extends legacy FTP to provide strong security, reliability, and high performance. GridFTP, like FTP, is a two-channel protocol: the control channel carries commands and responses, and the data channel carries the actual data. The control channel is encrypted and integrity protected by default. The data channel is authenticated by default. Encryption and integrity protection are both supported on the data channel but are not enabled by default because of their high CPU cost and the lower data transfer rates that result. In this paper, we present an extensive experimental study of the performance implications of enabling integrity protection and encryption on the data channel. We show that in a vast number of cases involving nonthreaded Globus GridFTP servers on multicore systems, the throughput of secure transfers is not comparable to that of nonencrypted, non-integrity-protected transfers because the available processors are used inefficiently. However, where a strong desire for higher security justifies a larger expenditure of processing resources, integrity protection and sometimes even cryptographic confidentiality can be provided without a decline in throughput. We show that this can be accomplished with threaded Globus GridFTP server instances configured with appropriately chosen parallelism and concurrency, which allows a more effective use of available system resources.
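To make the knobs discussed above concrete, the following is a minimal sketch of how a client-side transfer command might be assembled for each data channel protection level. It assumes the `globus-url-copy` client and its `-dcsafe` (integrity protection), `-dcpriv` (encryption plus integrity protection), `-p` (parallel streams), and `-cc` (concurrent transfers) options. The helper name, host names, and numeric values are hypothetical and are not taken from the study. The sketch only builds the argument list and does not run a transfer.

```python
from typing import Dict, List

# Data channel protection levels mapped to the globus-url-copy flags
# assumed to select them (assumption: flag names as in the Globus client).
PROTECTION_FLAGS: Dict[str, List[str]] = {
    "authenticated": [],       # default: data channel is authenticated only
    "integrity": ["-dcsafe"],  # integrity protection on the data channel
    "private": ["-dcpriv"],    # encryption plus integrity protection
}


def build_transfer_command(
    src: str,
    dst: str,
    protection: str = "authenticated",
    parallelism: int = 1,
    concurrency: int = 1,
) -> List[str]:
    """Return an argv list for a transfer (hypothetical helper).

    parallelism: number of parallel TCP streams per transfer (-p).
    concurrency: number of transfers run at the same time (-cc).
    """
    if protection not in PROTECTION_FLAGS:
        raise ValueError(f"unknown protection level: {protection!r}")
    if parallelism < 1 or concurrency < 1:
        raise ValueError("parallelism and concurrency must be >= 1")

    cmd = ["globus-url-copy"]
    cmd += PROTECTION_FLAGS[protection]
    cmd += ["-p", str(parallelism), "-cc", str(concurrency)]
    cmd += [src, dst]
    return cmd


if __name__ == "__main__":
    # Illustrative values only; suitable settings depend on the hosts and network.
    print(" ".join(build_transfer_command(
        "gsiftp://source.example.org/data/",
        "gsiftp://dest.example.org/data/",
        protection="private",
        parallelism=4,
        concurrency=2,
    )))
```

Whether the server side can spread the resulting cryptographic work across cores depends on how the GridFTP server is run (threaded or nonthreaded), which is configured separately from this client command.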
