Predicting the performance of GridFTP transfers

Summary form only given. Replication is a technique used in data grid environments to reduce access latency and network bandwidth consumption. Replication also increases data availability and thereby enhances the reliability of the system. Selecting the best replica depends on several factors, such as the past behavior of transfers, the current state of the network, and the state of the disk device. Here, we develop a predictive framework with a neural network that uses data from various sources to predict transfer bandwidth. We compare our results with regression models and demonstrate that the neural network technique outperforms regression-based predictors for large file transfers.
