T3-Scheduler: A Topology and Traffic Aware Two-Level Scheduler for Stream Processing Systems in a Heterogeneous Cluster

Abstract: To handle large volumes of data efficiently, scheduling algorithms in stream processing systems need to minimise data movement between communicating tasks, which improves system throughput. However, finding an optimal schedule for these systems is NP-hard. In this paper, we propose a heuristic scheduling algorithm, T3-Scheduler, for a heterogeneous fog or cloud cluster. It efficiently identifies the tasks that communicate with each other and assigns them to the same node, up to a specified level of utilisation for that node. Using three common micro-benchmarks and two real-world applications, we demonstrate that T3-Scheduler outperforms current state-of-the-art scheduling algorithms, such as Aniello et al.’s popular ‘Online scheduler’ and R-Storm, improving throughput by up to 32% for the two real-world applications.
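The core idea stated in the abstract, co-locating communicating tasks on one node until that node reaches a utilisation cap, can be illustrated with a small greedy sketch. This is not the T3-Scheduler algorithm itself, which the abstract does not specify in detail. It is a generic traffic-first heuristic written under stated assumptions: the function name `greedy_colocate`, its parameters, the single scalar load per task, and the example topology are all hypothetical.

```python
from typing import Dict, Tuple


def greedy_colocate(
    task_load: Dict[str, float],
    traffic: Dict[Tuple[str, str], float],
    node_capacity: Dict[str, float],
    max_utilisation: float = 0.8,
) -> Dict[str, str]:
    """Greedy sketch (hypothetical, not the paper's algorithm).

    Visits task pairs from heaviest to lightest traffic and tries to put
    each unplaced task on the node that already hosts its peer, without
    exceeding capacity * max_utilisation on any node.
    """
    # Remaining usable budget per node; heterogeneous capacities are allowed.
    budget = {n: cap * max_utilisation for n, cap in node_capacity.items()}
    placement: Dict[str, str] = {}

    def try_place(task: str, node: str) -> bool:
        if budget[node] >= task_load[task]:
            budget[node] -= task_load[task]
            placement[task] = node
            return True
        return False

    def place_anywhere(task: str) -> None:
        # Fall back to the node with the most remaining budget.
        for node in sorted(budget, key=budget.get, reverse=True):
            if try_place(task, node):
                return
        raise ValueError(f"no node has room for task {task!r}")

    heaviest_first = sorted(traffic.items(), key=lambda kv: kv[1], reverse=True)
    for (a, b), _rate in heaviest_first:
        for task, peer in ((a, b), (b, a)):
            if task in placement:
                continue
            if peer in placement and try_place(task, placement[peer]):
                continue
            place_anywhere(task)

    # Tasks that appear in no traffic edge still need a node.
    for task in task_load:
        if task not in placement:
            place_anywhere(task)
    return placement


# Hypothetical four-task topology on two nodes of different capacity.
example_load = {"spout": 20.0, "parse": 30.0, "count": 25.0, "sink": 10.0}
example_traffic = {
    ("spout", "parse"): 100.0,
    ("parse", "count"): 80.0,
    ("count", "sink"): 5.0,
}
example_nodes = {"big": 100.0, "small": 50.0}
print(greedy_colocate(example_load, example_traffic, example_nodes))
```

In this example the three heavily communicating tasks fit within the 80% budget of the larger node, and the lightly connected sink spills to the smaller node once that budget is nearly used up. A real scheduler would also need measured traffic rates and per-resource loads, which this sketch assumes are given.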
