Albatross: An efficient cloud-enabled task scheduling and execution framework using distributed message queues

Data analytics on large datasets has become common across organizations, and at larger scales it inevitably relies on distributed resources such as clouds. Utilizing all system resources effectively requires an efficient scheduler, but traditional resource managers and job schedulers are centralized and designed for small numbers of large batch jobs. Frameworks such as Hadoop and Spark, which are designed mainly for Big Data analytics, allow for more diversity in job types to some extent. However, even these systems have centralized architectures and do not perform well at large scales and under heavy task loads. Modern applications generate tasks at very high rates, which can cause significant slowdowns on these frameworks. Additionally, over-decomposition has been shown to be very useful in increasing system utilization. To achieve high efficiency, scalability, and better system utilization, a modern scheduler must be able to handle over-decomposition and run fine-grained tasks. We propose Albatross, a task-level scheduling and execution framework that uses a Distributed Message Queue (DMQ) to distribute tasks among its workers. Unlike most scheduling systems, which push tasks to workers, Albatross has its workers pull tasks, which lets it achieve good load balancing and scalability. For high performance, Albatross is written in C/C++, which imposes minimal overhead on the workload process compared to languages such as Java or Python. Furthermore, the framework has built-in support for task execution dependencies in workflows, so Albatross can run various types of workloads, including data analytics and HPC applications. Finally, Albatross provides data locality support, which allows the framework to achieve higher performance by minimizing unnecessary data movement on the network. Our evaluations show that Albatross outperforms Spark and Hadoop at larger scales and when running finer-grained workloads.
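The load-balancing argument for pulling over pushing can be illustrated with a small simulation. The sketch below is not Albatross code: it stands in for the DMQ with a plain in-process task list, and the function names `push_makespan` and `pull_makespan` as well as the task durations are hypothetical, chosen only for illustration. It compares the completion time (makespan) when tasks are assigned to workers up front in round-robin order against the case where each worker takes the next task as soon as it becomes idle.

```python
import heapq


def push_makespan(durations, n_workers):
    """Push model (assumed here to be static round-robin):
    task i is assigned to worker i % n_workers before execution starts."""
    load = [0.0] * n_workers
    for i, d in enumerate(durations):
        load[i % n_workers] += d
    # The slowest worker determines when the whole workload finishes.
    return max(load)


def pull_makespan(durations, n_workers):
    """Pull model: whichever worker becomes idle first takes the next
    task from a shared FIFO queue (a stand-in for the DMQ)."""
    # Min-heap of (time at which the worker becomes idle, worker id).
    idle_at = [(0.0, w) for w in range(n_workers)]
    heapq.heapify(idle_at)
    for d in durations:
        t, w = heapq.heappop(idle_at)        # earliest idle worker
        heapq.heappush(idle_at, (t + d, w))  # busy until t + d
    return max(t for t, _ in idle_at)


# Hypothetical workload: a few long tasks mixed with many short ones.
durations = [8, 1, 1, 1, 8, 1, 1, 1]
print(push_makespan(durations, 2))  # 18.0: one worker receives both long tasks
print(pull_makespan(durations, 2))  # 11.0: idle workers absorb the short tasks
```

With this workload the static assignment leaves one worker idle for most of the run, while the pull model spreads the work evenly without any central component tracking worker load. The simulation ignores queue access latency and network costs, which a real deployment would have to account for.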
