Handling Iterations in Distributed Dataflow Systems

Over the past decade, distributed dataflow systems (DDS) have become a standard technology. In these systems, users write programs in restricted dataflow programming models, such as MapReduce, which enable them to scale out program execution to a shared-nothing cluster of machines. Yet there is no established consensus on how to extend these programming models to support iterative algorithms. In this survey, we review the research literature and identify how DDS handle control flow, such as iteration, from both the programming-model and execution-level perspectives. This survey will be of interest to both users and designers of DDS.
