Performance Implications of Task Scheduling by Predicting Network Throughput on the Internet

A meta-scheduler assigns tasks to distributed resources efficiently, and the Internet is often used to share task data among those resources. Throughput prediction is therefore expected to play a crucial role in helping the meta-scheduler cope with the instability of network resources on the Internet. However, it is unclear whether higher prediction accuracy always guarantees better scheduling. In this paper, we focus on how predictors affect the overall processing time of given tasks through meta-scheduler simulation. Real traces of throughput on the Internet are used to build CDF-based and SVR-based predictors, which are then adopted by the meta-scheduler. In the simulation, the meta-scheduler using a predictor clearly reduces the processing time. Moreover, the meta-scheduler using the SVR-based predictor, which outperformed the CDF-based predictor in throughput prediction, reduced the processing time by up to 13.3% compared to the meta-scheduler without any predictions. The expected performance improvement was 3.73% with the SVR-based predictor and 2.53% with the CDF-based predictor. On the other hand, the meta-scheduler using a predictor occasionally assigns tasks to inappropriate sites because of a small number of inaccurate predictions. In those cases, the processing time increases drastically in comparison with the meta-scheduler without any predictions.
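As a rough illustration of how a throughput predictor can feed into site selection, the following is a minimal sketch and not the predictor or scheduler evaluated in the paper. It assumes that a CDF-based prediction is a quantile of the empirical distribution of past throughput samples, and that the scheduler picks the site with the smallest estimated transfer-plus-compute time. The names `cdf_predict` and `pick_site`, the quantile parameter `q`, and the units are all hypothetical.

```python
import math


def cdf_predict(history, q=0.5):
    """Return the q-quantile of the empirical CDF of past throughput samples.

    Assumption: the prediction is the smallest observed sample whose
    empirical CDF value is at least q (q=0.5 gives a lower median).
    """
    if not history:
        raise ValueError("history must not be empty")
    if not 0.0 < q <= 1.0:
        raise ValueError("q must be in (0, 1]")
    ordered = sorted(history)
    index = max(0, math.ceil(q * len(ordered)) - 1)
    return ordered[index]


def pick_site(data_mbit, sites, q=0.5):
    """Pick the site with the smallest estimated processing time.

    sites maps a site name to (throughput history in Mbit/s,
    compute time in seconds). Both fields are illustrative assumptions.
    """
    def estimated_time(name):
        history, compute_s = sites[name]
        transfer_s = data_mbit / cdf_predict(history, q)
        return transfer_s + compute_s

    return min(sites, key=estimated_time)


# Hypothetical sites: "a" computes quickly over a slow path,
# "b" computes slowly over a fast path.
sites = {
    "a": ([10.0, 20.0, 30.0, 40.0], 5.0),
    "b": ([50.0, 60.0, 70.0], 30.0),
}
small_task_site = pick_site(100.0, sites)
large_task_site = pick_site(10000.0, sites)
```

With these made-up numbers, a small transfer favors the site with the shorter compute time, while a large transfer favors the site with the higher predicted throughput. A single badly wrong prediction would flip that choice, which is the failure mode the abstract describes.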
