Radar: Reducing Tail Latencies for Batched Stream Processing with Blank Scheduling

Real-time processing of stream data has become increasingly vital. Batched stream systems, which discretize stream data into micro-batches and leverage a batch system to process these micro-batch stream jobs, have attracted wide attention from academia and industry. Such batched stream systems typically run in heterogeneous environments with both heterogeneous resources and heterogeneous tasks. Unfortunately, current batched stream system implementations, designed and optimized for homogeneous environments, perform poorly in heterogeneous environments. We attribute this suboptimal performance to scheduling tasks solely according to data locality and free slots. On the one hand, data locality creates a barrier between the large tasks on slow nodes and the spare capacity of fast nodes, because slow nodes prefer local large tasks over remote small tasks. On the other hand, because the scheduler is blind to task size, large tasks are very likely to be scheduled in the last few waves. These two factors prevent good load balancing and cause tail latencies for large tasks. To address these issues, we propose a blank scheduling framework called Radar. Being aware of node capacity and task size, Radar pre-steals large tasks from slow nodes and schedules tasks according to a largest-task-first principle. Radar then fills the remaining small free slots with small tasks matched to each node's capacity. We implement Radar in Spark 2.1.1. Experimental results with benchmarks show that Radar reduces job completion time by 27.78% to 42.79% compared with Spark Streaming. Experimental results with a real Tencent production application show that Radar reduces response time by 28.57%.
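The sketch below illustrates, in Scala, the capacity-aware, largest-task-first assignment idea described in the abstract. It is a minimal illustration under stated assumptions, not Radar's actual implementation: the names Task, Node, estimatedSize, capacity, and schedule are hypothetical and do not correspond to Spark or Radar APIs, and Radar's pre-stealing and blank-slot-filling mechanics are omitted.

    // A minimal sketch (not Radar's actual implementation) of capacity-aware,
    // largest-task-first assignment. Task, Node, estimatedSize, capacity, and
    // schedule are illustrative names, not Spark or Radar APIs.
    object BlankSchedulingSketch {
      case class Task(id: Int, estimatedSize: Long)                  // estimated input size in bytes
      case class Node(id: Int, capacity: Double, var load: Long = 0L)

      def schedule(tasks: Seq[Task], nodes: Seq[Node]): Map[Int, Int] = {
        // Largest task first: sort by descending size so large tasks are not
        // pushed into the last few scheduling waves.
        val ordered = tasks.sortBy(-_.estimatedSize)
        ordered.map { task =>
          // Capacity-aware placement: pick the node with the lowest load
          // relative to its capacity, so large tasks drift toward fast nodes
          // instead of staying local to slow nodes.
          val target = nodes.minBy(n => n.load.toDouble / n.capacity)
          target.load += task.estimatedSize
          task.id -> target.id
        }.toMap
      }

      def main(args: Array[String]): Unit = {
        val tasks = Seq(Task(1, 800L), Task(2, 200L), Task(3, 500L))
        val nodes = Seq(Node(1, capacity = 2.0), Node(2, capacity = 1.0))
        // The largest task (id 1) lands on the faster node (capacity 2.0).
        println(schedule(tasks, nodes))
      }
    }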
