On the predictability of large transfer TCP throughput

Predicting the throughput of large TCP transfers is important for a broad class of applications. This paper focuses on the design, empirical evaluation, and analysis of TCP throughput predictors. We first classify TCP throughput prediction techniques into two categories: Formula-Based (FB) and History-Based (HB). Within each class, we develop representative prediction algorithms, which we then evaluate empirically over the RON testbed. FB prediction relies on mathematical models that express the TCP throughput as a function of the characteristics of the underlying network path. It does not rely on previous TCP transfers on the given path, and it can be performed with non-intrusive network measurements. We show, however, that the FB method is accurate only if the TCP transfer is window-limited to the point that it does not saturate the underlying path, and we explain the main causes of the prediction errors. HB techniques predict the throughput of TCP flows from a time series of previous TCP throughput measurements on the same path, when such a history is available. We show that even simple HB predictors, such as Moving Average and Holt-Winters, can be quite accurate when using a history of only a few sporadic samples. On the negative side, HB predictors are highly path-dependent. We explain the cause of such path dependencies based on two key factors: the load on the path and the degree of statistical multiplexing.
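To make the two classes concrete, the following is a minimal sketch and not the predictors evaluated in the paper. The abstract does not state which throughput formula or smoothing parameters are used, so the FB function assumes the well-known square-root loss model of Mathis et al. capped by the maximum window, and the HB functions assume a plain n-sample moving average and a non-seasonal (level plus trend) form of Holt-Winters. All function names, parameter names, and the initialization of the level and trend are illustrative assumptions.

```python
import math


def fb_predict(mss_bytes, rtt_s, loss_rate, max_window_bytes):
    """Formula-based sketch: predicted throughput in bytes per second.

    Assumption: square-root loss model, (MSS / RTT) * sqrt(3 / (2p)),
    capped by the window limit W_max / RTT. The paper may use a
    different model.
    """
    window_limit = max_window_bytes / rtt_s
    if loss_rate <= 0:
        # With no measured loss, only the window limit bounds the estimate.
        return window_limit
    congestion_limit = (mss_bytes / rtt_s) * math.sqrt(3.0 / (2.0 * loss_rate))
    return min(window_limit, congestion_limit)


def moving_average(history, n):
    """History-based sketch: mean of the last n throughput samples."""
    if not history:
        raise ValueError("history must contain at least one sample")
    window = history[-n:]
    return sum(window) / len(window)


def holt_winters(history, alpha, beta):
    """History-based sketch: non-seasonal Holt-Winters forecast.

    Tracks a smoothed level and trend and returns the one-step-ahead
    forecast, level + trend. alpha and beta are smoothing weights in
    (0, 1); initializing the trend from the first two samples is an
    assumption.
    """
    if not history:
        raise ValueError("history must contain at least one sample")
    if len(history) < 2:
        return history[-1]
    level = history[0]
    trend = history[1] - history[0]
    for sample in history[1:]:
        previous_level = level
        level = alpha * sample + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return level + trend
```

For example, `moving_average(samples, 5)` and `holt_winters(samples, 0.5, 0.5)` each take a list of past throughput measurements on one path and return a forecast for the next transfer, whereas `fb_predict` needs only path measurements (RTT, loss rate) and the sender's maximum window.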
