Adaptation and Policy-Based Resource Allocation for Efficient Bulk Data Transfers in High Performance Computing Environments

Many science applications increasingly rely on data-intensive methods that require bulk data movement, such as staging large datasets in preparation for analysis on shared computational resources, remote access to large datasets, and data dissemination. Over the next 5 to 10 years, these datasets are projected to grow to exabytes, and continued scientific progress will depend on efficient methods for moving data between high performance computing centers. We study two techniques that improve the use of available resources for large, long-running, multi-file transfers. First, we show the effect of adapting transfer parameters during multi-file transfers based on recently observed performance. Second, we use Virtual Organization and site policies to influence how resources, such as available transfer streams, are allocated to clients. We show that these techniques improve completion times for large multi-file data transfers by approximately 20% on resource-constrained infrastructure.
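
As an illustration only, the two techniques can be combined in a short sketch: the number of parallel streams is adjusted between files according to recently measured throughput, and the result is capped by a policy-assigned limit. The function name `next_stream_count`, the 5% tolerance, and the one-stream step size are hypothetical choices made for this example; they are not taken from the paper, and the actual adaptation and policy mechanisms may differ.

```python
def next_stream_count(current, recent_mbps, previous_mbps, policy_cap, tolerance=0.05):
    """Pick the stream count for the next file in a multi-file transfer.

    Hypothetical rule (an assumption, not the paper's algorithm):
    add one stream while throughput is improving, remove one when it
    drops, and never exceed the cap granted by VO/site policy.
    """
    if previous_mbps <= 0:
        # No usable history yet: keep the current setting within the policy cap.
        return min(max(current, 1), policy_cap)

    # Relative change in throughput between the last two measurements.
    change = (recent_mbps - previous_mbps) / previous_mbps

    if change > tolerance:
        proposed = current + 1      # performance improved: probe upward
    elif change < -tolerance:
        proposed = current - 1      # performance degraded: back off
    else:
        proposed = current          # within tolerance: hold steady

    # Always use at least one stream, and at most the policy allocation.
    return max(1, min(proposed, policy_cap))
```

For example, a client running 4 streams that sees throughput rise from 100 to 120 MB/s would try 5 streams on the next file, while a client already at its policy cap stays at the cap regardless of how much throughput improved.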
