The sandpile scheduler

This paper studies a self-organized criticality model, the sandpile, for dynamically load-balancing tasks that arrive as Bags-of-Tasks in large-scale decentralized systems. The sandpile is designed as a decentralized agent system that behaves as a cellular automaton working in a critical state at the edge of chaos. Depending on the state of the cellular automaton, assigning a new task to a resource may produce different responses: it may change nothing, or it may trigger avalanches that reconfigure the state of the system. The frequency of such avalanches follows a power law in their size, a scale-invariant behavior that emerges without requiring tuning or control parameters. This means that large, catastrophic avalanches are very rare, while small ones occur very often. Such an emergent pattern can be efficiently adapted to non-clairvoyant scheduling, where tasks are load-balanced across computing resources to maximize performance without assuming any knowledge of the tasks' features. The algorithm design is validated experimentally, showing that the sandpile is able to find near-optimal schedules by reacting differently to different workload conditions and architectures.
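
To make the avalanche mechanism concrete, the following is a minimal sketch of a sandpile-style relaxation rule, not the algorithm evaluated in the paper. It assumes a symmetric neighbor graph (a ring in the usage below), unit-size tasks, and a hypothetical toppling rule: a node hands one task to its least-loaded neighbor whenever the load difference reaches `threshold`. The names `add_task`, `load`, `neighbors`, and `threshold` are illustrative assumptions.

```python
from collections import deque


def add_task(load, neighbors, node, threshold=2):
    """Drop one task on `node`, relax the pile, and return the avalanche size.

    Assumed toppling rule (illustrative, not taken from the paper): a node
    passes one task to its least-loaded neighbor while the load gap is at
    least `threshold`. With threshold >= 2 every move lowers the sum of
    squared loads, so the relaxation always terminates.
    """
    load[node] += 1
    pending = deque([node])  # nodes that may be unstable
    moves = 0                # avalanche size = number of task migrations
    while pending:
        i = pending.popleft()
        j = min(neighbors[i], key=lambda n: load[n])
        if load[i] - load[j] >= threshold:
            load[i] -= 1
            load[j] += 1
            moves += 1
            # i may still be unstable, and its neighbors (including j)
            # may have become unstable after the move.
            pending.append(i)
            pending.extend(neighbors[i])
    return moves


# Usage: a ring of 8 resources, with every task arriving at node 0.
n = 8
ring = [[(i - 1) % n, (i + 1) % n] for i in range(n)]
load = [0] * n
sizes = [add_task(load, ring, 0) for _ in range(40)]
```

In this sketch some arrivals change nothing while others cascade across several nodes, and `sizes` records the avalanche size for each arrival. Whether the resulting size distribution follows a power law depends on the topology and the toppling rule, which this example does not attempt to reproduce.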
