From Theory to Practice: Efficient Join Query Evaluation in a Parallel Database System

Big data analytics often requires processing complex queries using massive parallelism, where the main performance metric is the communication cost incurred during data reshuffling. In this paper, we describe a system that can efficiently compute complex join queries, including queries with cyclic joins, on a massively parallel architecture. We build on two independent lines of work for multi-join query evaluation: a communication-optimal algorithm for distributed evaluation, and a worst-case optimal algorithm for sequential evaluation. We evaluate these algorithms together, then describe novel, practical optimizations for both algorithms.
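To make the reshuffling cost concrete, here is a minimal sketch of a one-round shuffle for the cyclic triangle query Q(x, y, z) = R(x, y), S(y, z), T(z, x). The abstract does not name the distributed algorithm it builds on; the sketch assumes a HyperCube (Shares) style scheme, in which servers form a px × py × pz grid and each tuple is replicated along the one dimension it does not constrain. The function names, the use of Python's built-in `hash`, and the simple hash-based local join are all illustrative assumptions; in particular, the local join shown is not a worst-case optimal algorithm.

```python
from collections import defaultdict


def local_join(r, s, t):
    """Plain hash join on one server (illustrative only, not worst-case optimal)."""
    s_by_y = defaultdict(list)
    for y, z in s:
        s_by_y[y].append(z)
    t_set = set(t)
    return {
        (x, y, z)
        for x, y in r
        for z in s_by_y.get(y, ())
        if (z, x) in t_set
    }


def hypercube_triangles(R, S, T, shares=(2, 2, 2)):
    """Simulate a single-round hypercube shuffle for R(x,y), S(y,z), T(z,x).

    Returns the join result and the number of tuples sent over the
    (simulated) network, which is the communication cost of the shuffle.
    """
    px, py, pz = shares
    servers = defaultdict(lambda: {"R": [], "S": [], "T": []})
    sent = 0

    # R(x, y) fixes the x and y coordinates; replicate along z.
    for x, y in R:
        for k in range(pz):
            servers[(hash(x) % px, hash(y) % py, k)]["R"].append((x, y))
            sent += 1
    # S(y, z) fixes y and z; replicate along x.
    for y, z in S:
        for i in range(px):
            servers[(i, hash(y) % py, hash(z) % pz)]["S"].append((y, z))
            sent += 1
    # T(z, x) fixes z and x; replicate along y.
    for z, x in T:
        for j in range(py):
            servers[(hash(x) % px, j, hash(z) % pz)]["T"].append((z, x))
            sent += 1

    # Each server joins only its own fragments; the union is the full answer.
    result = set()
    for frag in servers.values():
        result |= local_join(frag["R"], frag["S"], frag["T"])
    return result, sent


if __name__ == "__main__":
    edges = [(1, 2), (2, 3), (3, 1)]
    triangles, cost = hypercube_triangles(edges, edges, edges)
    print(sorted(triangles), cost)
```

Every output tuple (x, y, z) is produced only at the server with coordinates (h(x), h(y), h(z)), so no deduplication across servers is needed. The cost in this sketch is |R|·pz + |S|·px + |T|·py tuples, which shows why the choice of shares drives the communication cost.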
