On the importance of bandwidth control mechanisms for scheduling on large scale heterogeneous platforms

We study three scheduling problems (file redistribution, independent task scheduling, and broadcasting) on large-scale heterogeneous platforms under the Bounded Multi-port Model. In this model, each node is associated with an incoming and an outgoing bandwidth and can be involved in an arbitrary number of communications, provided that neither its incoming nor its outgoing bandwidth is exceeded. This model corresponds well to modern networking technologies: it can be used when programming at the TCP level and is also implemented in modern message-passing libraries such as MPICH2. We prove, using the three scheduling problems mentioned above, that this model is tractable and that even very simple distributed algorithms can achieve optimal performance, provided that bandwidth sharing policies can be enforced. Our goal is to establish that such QoS mechanisms, which are now available in the kernels of modern operating systems, are necessary to achieve optimal performance. We prove that implementations of optimal algorithms that do not enforce the prescribed bandwidth sharing can fall far short of optimal performance when only TCP contention mechanisms are used. More precisely, for each scheduling problem considered, we establish upper bounds on the performance loss that can be induced by TCP bandwidth sharing mechanisms, prove that these upper bounds are tight by exhibiting instances that achieve them, and provide a set of simulations using SimGRID to analyze the practical impact of bandwidth control mechanisms.
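
As an illustration of the Bounded Multi-port Model constraint described above, the following minimal Python sketch checks whether a set of communications respects every node's incoming and outgoing bandwidth. The function name `is_feasible`, the `(source, destination, rate)` flow representation, and the tolerance `eps` are assumptions made for this example; they are not taken from the paper.

```python
from collections import defaultdict


def is_feasible(b_in, b_out, flows, eps=1e-9):
    """Check the bounded multi-port constraint (illustrative sketch).

    b_in, b_out: dicts mapping node -> incoming / outgoing bandwidth.
    flows: iterable of (source, destination, rate) tuples; a node may
           appear in any number of flows.
    Returns True if no node's incoming or outgoing bandwidth is exceeded.
    """
    used_in = defaultdict(float)
    used_out = defaultdict(float)
    for src, dst, rate in flows:
        if rate < 0:
            raise ValueError("flow rates must be non-negative")
        used_out[src] += rate  # traffic leaving the sender
        used_in[dst] += rate   # traffic entering the receiver
    out_ok = all(used_out[n] <= b_out[n] + eps for n in used_out)
    in_ok = all(used_in[n] <= b_in[n] + eps for n in used_in)
    return out_ok and in_ok


# Hypothetical three-node platform: "a" sends to "b" and "c" concurrently.
b_in = {"a": 10.0, "b": 5.0, "c": 5.0}
b_out = {"a": 10.0, "b": 5.0, "c": 5.0}
ok = is_feasible(b_in, b_out, [("a", "b", 5.0), ("a", "c", 5.0)])
too_much = is_feasible(b_in, b_out, [("a", "b", 6.0), ("a", "c", 5.0)])
```

In the first call node "a" saturates its outgoing bandwidth exactly, so the allocation is accepted; in the second call both the outgoing bandwidth of "a" and the incoming bandwidth of "b" are exceeded. Note that the check only validates prescribed rates; it does not model how TCP would actually share the bandwidth when those rates are not enforced.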
