Continuously Improving the Resource Utilization of Iterative Parallel Dataflows

Parallel dataflow systems such as Apache Flink allow the analysis of large datasets with iterative programs. However, allocating a cost-effective set of resources for such jobs is difficult, as resource utilization depends on many factors: dataset size, key value distributions, the computational complexity of the programs, and the underlying hardware. Moreover, some of these factors are not well known before execution; for example, data statistics such as key value distributions are often not available beforehand. We therefore propose to improve resource utilization at runtime by exploiting the repetitive nature of iterative dataflow programs: based on runtime statistics gathered in previous iterations, the resource allocation is adapted dynamically at the synchronization barriers between iterations. This approach has two advantages. First, detailed statistics can be available at barriers, even for task pipelines executed in parallel. Second, at barriers dataflows can be adapted without complex handling of intermediate task state. This paper presents a prototype integrated with Apache Flink and an evaluation on a cluster with 480 cores. One experiment shows a 57% reduction in job runtime by allocating more resources for a shorter time; another shows the release of up to 40% of surplus resources without significantly extending the job runtime.
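The adaptation scheme described above can be sketched as a simple controller invoked at each synchronization barrier: it reads the statistics of the completed iteration and decides the allocation for the next one. This is a minimal illustrative sketch, not the paper's actual Flink integration; the names (`IterationStats`, `plan_allocation`) and the utilization thresholds are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class IterationStats:
    """Runtime statistics gathered at the synchronization barrier (illustrative)."""
    runtime_s: float      # wall-clock time of the completed iteration
    avg_cpu_util: float   # mean CPU utilization across allocated slots, in [0, 1]

def plan_allocation(current_slots: int, stats: IterationStats,
                    min_slots: int = 1, max_slots: int = 480,
                    low: float = 0.5, high: float = 0.9) -> int:
    """Decide the slot count for the next iteration.

    Scale out when the allocated slots are saturated (utilization above
    `high`); release surplus slots when utilization falls below `low`.
    The thresholds and the doubling/halving policy are hypothetical.
    """
    if stats.avg_cpu_util > high:
        return min(current_slots * 2, max_slots)
    if stats.avg_cpu_util < low:
        return max(current_slots // 2, min_slots)
    return current_slots
```

Because the decision runs only at barriers, no intermediate task state has to be migrated: the next iteration simply starts with the new degree of parallelism, which is the key simplification the paper's approach relies on.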
