RIOS: Runtime Integrated Optimizer for Spark

Many Data-Intensive Scalable Computing (DISC) systems do not support sophisticated cost-based query optimizers because they lack the necessary data statistics. Consequently, many crucial optimizations, such as join ordering and plan selection, are poorly supported in DISC systems. RIOS is a Runtime Integrated Optimizer for Spark that binds to execution plans lazily, at runtime, after collecting the statistics needed to make better decisions. We evaluate the efficacy of our approach and show that better plans can be derived at runtime, achieving more than an order-of-magnitude performance improvement over the compile-time plans produced by the Apache Spark rule-based optimizer.
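To make the idea of choosing a join order from runtime statistics concrete, the following is a minimal Python sketch of a greedy left-deep join ordering driven by observed row counts. It is not the RIOS algorithm, which the abstract does not specify; the function names, the `row_counts` and `selectivity` inputs, and the greedy strategy itself are assumptions made purely for illustration.

```python
from itertools import combinations


def estimate_join_size(left_size, left_tables, right, row_counts, selectivity):
    """Estimate the output size of joining an intermediate result with `right`.

    Assumption: predicates are independent, and a pair of tables with no
    entry in `selectivity` has no join predicate (selectivity 1.0, i.e. a
    cross product).
    """
    size = left_size * row_counts[right]
    for table in left_tables:
        size *= selectivity.get(frozenset((table, right)), 1.0)
    return size


def greedy_left_deep_order(row_counts, selectivity):
    """Return (join order, estimated intermediate sizes) for a left-deep plan.

    `row_counts` maps table name -> row count observed at runtime.
    `selectivity` maps frozenset({a, b}) -> estimated join selectivity.
    Both inputs are hypothetical stand-ins for statistics gathered at runtime.
    """
    tables = sorted(row_counts)
    if len(tables) < 2:
        return tables, []

    def pair_size(pair):
        a, b = pair
        return estimate_join_size(row_counts[a], [a], b, row_counts, selectivity)

    # Start with the pair whose join is estimated to be smallest.
    first, second = min(combinations(tables, 2), key=pair_size)
    order = [first, second]
    size = pair_size((first, second))
    sizes = [size]

    # Repeatedly add the table that keeps the intermediate result smallest.
    remaining = [t for t in tables if t not in order]
    while remaining:
        best = min(
            remaining,
            key=lambda t: estimate_join_size(size, order, t, row_counts, selectivity),
        )
        size = estimate_join_size(size, order, best, row_counts, selectivity)
        order.append(best)
        sizes.append(size)
        remaining.remove(best)
    return order, sizes
```

With accurate row counts available only at runtime, a procedure like this can avoid joining two large inputs early, which is the kind of decision a purely rule-based, compile-time optimizer cannot make reliably.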
