A Data Throughput Prediction and Optimization Service for Widely Distributed Many-Task Computing

In this paper, we present the design and implementation of an application-layer data throughput prediction and optimization service for many-task computing in widely distributed environments. The service uses multiple parallel TCP streams to improve the end-to-end throughput of data transfers. We develop a novel mathematical model to determine the number of parallel streams required to achieve the best network performance. The model can predict this optimal number from as few as three prediction points. We implement the service in the Stork Data Scheduler, where the prediction points can be obtained through Iperf and GridFTP sampling. Our results show that, in most cases, the prediction cost plus the optimized transfer time is much less than the nonoptimized transfer time. As a result, Stork data transfer jobs that use the optimization service can complete much earlier than nonoptimized data transfer jobs.
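
To make the idea of predicting the optimal stream count from three prediction points concrete, here is a minimal Python sketch. It is not taken from the abstract, which does not state the model's form. It assumes, purely for illustration, that the throughput with n parallel streams follows Th(n) = n / sqrt(a*n^2 + b*n + c), so that three sampled (n, throughput) pairs determine a, b, and c through a linear system. The function names `fit_curve` and `optimal_streams` are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_curve(samples):
    """Fit a, b, c from exactly three (n, throughput) samples.

    Assumed model (an illustration, not the paper's stated formula):
        Th(n) = n / sqrt(a*n^2 + b*n + c)
    Squaring and rearranging gives a linear equation per sample:
        a*n^2 + b*n + c = (n / Th)^2
    """
    if len(samples) != 3:
        raise ValueError("exactly three samples are required")
    matrix = [[n * n, n, 1.0] for n, _ in samples]
    rhs = [(n / th) ** 2 for n, th in samples]
    d = det3(matrix)
    if d == 0:
        raise ValueError("stream counts must be distinct")
    # Cramer's rule: replace one column at a time with the right-hand side.
    coeffs = []
    for col in range(3):
        replaced = [row[:] for row in matrix]
        for i in range(3):
            replaced[i][col] = rhs[i]
        coeffs.append(det3(replaced) / d)
    return tuple(coeffs)


def optimal_streams(a, b, c):
    """Return the stream count that maximizes the assumed Th(n).

    dTh/dn has the sign of (b*n + 2*c), so an interior maximum exists
    only when c > 0 and b < 0, at n = -2c / b. Otherwise return None.
    """
    if c > 0 and b < 0:
        return -2.0 * c / b
    return None


def predicted_throughput(n, a, b, c):
    """Evaluate the assumed throughput curve at n streams."""
    return n / math.sqrt(a * n * n + b * n + c)
```

In practice the three samples would come from short probe transfers (for example with 1, 2, and 4 streams), and the returned value would be rounded to a whole number of streams before scheduling the transfer.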
