Utility-Based Scheduling for Bulk Data Transfers between Distributed Computing Facilities

Today's scientific applications increasingly involve large amounts of input/output data that must be moved among multiple computing facilities via wide-area networks (WANs). WAN bandwidth, however, is growing much more slowly than data volumes and is thus becoming a bottleneck. Moreover, network bandwidth has not traditionally been treated as a limited resource, so coordinated allocation is lacking. Uncoordinated scheduling of competing data transfers over shared network links results in suboptimal system performance and a poor user experience. To address these problems, we propose a data transfer scheduler that coordinates and schedules data transfers between distributed computing facilities over WANs. Specifically, the scheduler prioritizes data transfer requests and allocates resources to them based on user-centric utility functions, with the goal of maximizing overall user satisfaction. Trace-based simulations demonstrate that our data transfer scheduling algorithms can considerably improve both data transfer performance and quantified user satisfaction compared with traditional first-come, first-served or shortest-job-first approaches.
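To make the idea of utility-based ordering concrete, the following is a minimal sketch and not the scheduler evaluated in the paper. It assumes a single shared link that serves one transfer at a time, and a hypothetical linearly decaying utility function (the abstract does not specify the actual utility functions). The utility-aware ordering shown is the classic ratio rule (serve first the transfer that loses the most utility per second of service time), swapped in here as a simple stand-in. All names, parameters, and workload values are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Transfer:
    name: str
    size_gb: float      # data volume in gigabytes
    max_utility: float  # hypothetical: user value if the transfer finished instantly
    decay_per_s: float  # hypothetical: utility lost per second until completion


def utility(t: Transfer, finish_time_s: float) -> float:
    """Assumed linear-decay utility, clamped at zero."""
    return max(0.0, t.max_utility - t.decay_per_s * finish_time_s)


def duration_s(t: Transfer, bandwidth_gbps: float) -> float:
    """Service time on a link of the given capacity (gigabits per second)."""
    return t.size_gb * 8 / bandwidth_gbps


def total_utility(order, bandwidth_gbps: float) -> float:
    """Serve transfers one at a time in the given order; sum the utilities."""
    clock = 0.0
    total = 0.0
    for t in order:
        clock += duration_s(t, bandwidth_gbps)
        total += utility(t, clock)
    return total


def fcfs(transfers):
    """First-come, first-served: keep arrival order."""
    return list(transfers)


def sjf(transfers):
    """Shortest-job-first: smallest transfer first."""
    return sorted(transfers, key=lambda t: t.size_gb)


def utility_aware(transfers, bandwidth_gbps: float):
    """Ratio rule: highest utility loss per second of service time first."""
    return sorted(
        transfers,
        key=lambda t: t.decay_per_s / duration_s(t, bandwidth_gbps),
        reverse=True,
    )


# Illustrative workload (made-up values), listed in arrival order.
BANDWIDTH_GBPS = 10.0
requests = [
    Transfer("bulk", size_gb=1000, max_utility=1000, decay_per_s=0.1),
    Transfer("small", size_gb=50, max_utility=100, decay_per_s=0.01),
    Transfer("urgent", size_gb=200, max_utility=1000, decay_per_s=2.0),
]

u_fcfs = total_utility(fcfs(requests), BANDWIDTH_GBPS)
u_sjf = total_utility(sjf(requests), BANDWIDTH_GBPS)
u_aware = total_utility(utility_aware(requests, BANDWIDTH_GBPS), BANDWIDTH_GBPS)

print(f"FCFS: {u_fcfs:.1f}  SJF: {u_sjf:.1f}  utility-aware: {u_aware:.1f}")
```

On this made-up workload the utility-aware ordering serves the time-sensitive transfer first and yields a higher total utility than either baseline. The example shows only the ordering principle: it ignores multiple links, concurrent transfers, and request arrivals over time, all of which a real WAN scheduler would have to handle.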
