Scalable Query Evaluation in Relational Databases

The scalability of a query depends on the amount of data that must be accessed to compute the answer. This suggests three general strategies for improving query performance: reduce the amount of data (including intermediate results) that is accessed by accessing it more intelligently; reduce the quantity of data in the first place; and increase the amount of data accessed per unit of time. This PhD dissertation presents four research results that together cover all three approaches. The first three results focus on variations of the widely applicable join-project query class: a join of two database tables followed by a duplicate-eliminating projection. Join-projects are equivalent to sparse Boolean matrix multiplication and to frequent pair mining (the special case of frequent itemset mining with itemset cardinality limited to 2). We describe a new output-sensitive algorithm for join-projects that has small intermediate results on worst-case inputs and, in particular, is efficient in both the RAM and I/O models. The algorithm uses the output size to choose its computation strategy, which introduces a chicken-and-egg problem: how do we obtain the output size without actually computing the output? This question is answered in another result, in which we obtain a (1 ± ε)-approximation of the output size in expected linear time and I/O for ε > 1/n^(1/4). In another result we address throughput itself by using the massively parallel capabilities of graphics processing units (GPUs) to handle the pair mining problem. For this we present BatMap, a novel vertical data layout that is particularly well suited to parallel processing. The last result deals with the general problem of reducing the quantity of data that must be accessed to answer any given query on a row-store RDBMS. We present a quadratic integer program formulation of the vertical partitioning problem for OLTP workloads in a distributed environment. This quadratic optimization problem is NP-hard, so we also describe a randomized heuristic that has empirically been shown to be reliable in terms of both speed and cost reduction.
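
To make the join-project query class concrete, the following is a minimal Python sketch of the naive baseline, not the output-sensitive algorithm of the dissertation. It assumes two binary relations R(a, b) and S(b, c) given as lists of tuples; the function names `join_project` and `boolean_product_support` are hypothetical and chosen only for illustration. The second function computes the same answer as the set of nonzero positions of a Boolean matrix product, which illustrates the equivalence stated above.

```python
from collections import defaultdict


def join_project(r, s):
    """Naive join-project: join R(a, b) with S(b, c) on b, project onto (a, c),
    and eliminate duplicates. Illustrative baseline only."""
    # Index S by the join attribute b.
    s_by_b = defaultdict(set)
    for b, c in s:
        s_by_b[b].add(c)

    # Enumerate every joined tuple; the set removes duplicate (a, c) pairs.
    # The number of joined tuples visited here can far exceed the output size,
    # which is the intermediate-result cost that smarter algorithms avoid.
    out = set()
    for a, b in r:
        for c in s_by_b.get(b, ()):
            out.add((a, c))
    return out


def boolean_product_support(r, s):
    """Same answer, viewed as the nonzero positions of the Boolean product of
    the matrix with ones at positions R and the matrix with ones at positions S."""
    left = set(r)
    right = set(s)
    rows = {a for a, _ in left}
    mids = {b for _, b in left} | {b for b, _ in right}
    cols = {c for _, c in right}
    return {
        (a, c)
        for a in rows
        for c in cols
        if any((a, b) in left and (b, c) in right for b in mids)
    }


# Hypothetical example data.
R = [(1, "x"), (1, "y"), (2, "y")]
S = [("x", 10), ("y", 10), ("y", 20)]

result = join_project(R, S)
```

In this example the join produces five tuples, but the projection keeps only four distinct (a, c) pairs, because (1, 10) is reachable through both "x" and "y".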
