Towards Low-Latency Batched Stream Processing by Pre-Scheduling

Many stream processing frameworks have been developed to meet the requirements of real-time processing. Among them, batched stream processing frameworks are widely advocated for their fault tolerance, high throughput, and unified runtime with batch processing. In batched stream processing frameworks, stragglers, which arise from uneven task execution times, are regarded as a major hurdle for latency-sensitive applications. Existing straggler mitigation techniques, whether reactive or proactive, are all post-scheduling methods, and therefore inevitably incur high resource overhead or long job completion times. We observe that batched stream processing jobs are usually recurring and have predictable characteristics. Exploiting this observation, we present a pre-scheduling straggler mitigation framework called Lever. Lever first identifies potential stragglers and evaluates each node's capacity by analyzing the execution information of historical jobs. Lever then pre-schedules job input data to each node before task scheduling so as to mitigate potential stragglers. We implement Lever and contribute it as an extension of Apache Spark Streaming. Our experimental results show that Lever reduces job completion time by 30.72% to 42.19% compared with Spark Streaming, a widely adopted batched stream processing system, and significantly outperforms traditional techniques.
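
The pre-scheduling idea can be illustrated with a minimal sketch. It is not Lever's actual algorithm, which the abstract does not specify; it assumes that node capacity is estimated as the mean historical throughput (records per second), that a potential straggler is any node below a fraction of the median capacity, and that input data is split in proportion to capacity. The function names, the `threshold` parameter, and the sample history are all hypothetical.

```python
from statistics import mean, median


def estimate_capacity(history):
    """Estimate each node's capacity as its mean historical throughput.

    history maps node name -> list of (records_processed, seconds) pairs
    taken from earlier runs of the recurring job (assumed input format).
    """
    return {
        node: mean(records / seconds for records, seconds in runs)
        for node, runs in history.items()
    }


def flag_potential_stragglers(capacity, threshold=0.5):
    """Return nodes whose capacity is below threshold * median capacity.

    The median-based rule is an assumption made for this sketch.
    """
    cutoff = threshold * median(capacity.values())
    return sorted(node for node, cap in capacity.items() if cap < cutoff)


def pre_schedule(total_records, capacity):
    """Split total_records across nodes in proportion to capacity.

    Fractional shares are rounded down, and the leftover records go to
    the nodes with the largest fractional remainders, so the plan always
    sums to total_records.
    """
    total_capacity = sum(capacity.values())
    shares = {n: total_records * c / total_capacity for n, c in capacity.items()}
    plan = {n: int(s) for n, s in shares.items()}
    leftover = total_records - sum(plan.values())
    by_remainder = sorted(shares, key=lambda n: shares[n] - plan[n], reverse=True)
    for node in by_remainder[:leftover]:
        plan[node] += 1
    return plan


# Hypothetical history of a recurring job on three nodes.
history = {
    "node-a": [(1000, 1.0), (1200, 1.0)],
    "node-b": [(500, 1.0), (600, 1.0)],
    "node-c": [(100, 1.0), (120, 1.0)],
}
capacity = estimate_capacity(history)
stragglers = flag_potential_stragglers(capacity)
plan = pre_schedule(1760, capacity)
print(stragglers, plan)
```

In this sketch the slow node receives a proportionally smaller share of the batch before any task is scheduled, so all nodes would be expected to finish at roughly the same time instead of waiting on one late task.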
