Approximation schemes for many-objective query optimization

Multi-objective query optimization (MOQO) aims to find query plans that strike a good compromise between conflicting objectives, such as minimizing execution time and minimizing monetary fees in a cloud scenario. A previously proposed exhaustive MOQO algorithm needs hours to optimize even simple TPC-H queries. We therefore propose several approximation schemes for MOQO that generate guaranteed near-optimal plans in seconds where exhaustive optimization takes hours. We integrated all MOQO algorithms into the Postgres optimizer, extended the Postgres cost model, and present experimental results for TPC-H queries with up to nine conflicting objectives. The proposed algorithms are based on a formal analysis of typical cost functions that occur in the context of MOQO. We identify properties that hold for a broad range of objectives and can be exploited in the design of future MOQO algorithms.
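
To make the notion of a "guaranteed near-optimal" plan set more concrete, the following is a minimal sketch of approximate Pareto pruning over plan cost vectors. It is an illustration under stated assumptions, not the algorithms proposed in the paper: the names `approx_dominates`, `prune`, and the precision parameter `alpha` are hypothetical, cost vectors are assumed to be tuples of non-negative values to be minimized, and the example costs are made up. With `alpha = 1.0` the sketch keeps the exact Pareto frontier; with `alpha > 1.0` it may discard a plan whenever a kept plan is at most a factor `alpha` worse on every objective.

```python
from typing import List, Sequence, Tuple

CostVector = Tuple[float, ...]


def approx_dominates(a: Sequence[float], b: Sequence[float], alpha: float) -> bool:
    """Return True if a is at most a factor alpha worse than b on every objective."""
    return all(x <= alpha * y for x, y in zip(a, b))


def prune(costs: Sequence[CostVector], alpha: float = 1.0) -> List[CostVector]:
    """Keep a subset of cost vectors that alpha-covers all input vectors.

    Hypothetical helper: every input vector is approximately dominated
    (within factor alpha) by at least one vector in the result.
    """
    kept: List[CostVector] = []
    for cost in costs:
        # Skip the new vector if a kept vector is already "good enough".
        if any(approx_dominates(k, cost, alpha) for k in kept):
            continue
        # Evict kept vectors only when the new one dominates them exactly,
        # so the approximation error never compounds beyond alpha.
        kept = [k for k in kept if not approx_dominates(cost, k, 1.0)]
        kept.append(cost)
    return kept


# Made-up (execution time, monetary fee) pairs for candidate plans.
plans = [(10.0, 1.0), (1.0, 10.0), (10.5, 1.0), (5.0, 5.0), (5.2, 4.9)]
exact_frontier = prune(plans, alpha=1.0)
coarse_frontier = prune(plans, alpha=1.1)
```

Here a larger `alpha` trades precision for a smaller plan set: the fewer plans an optimizer has to retain and combine, the less work it does, which is the general intuition behind approximating the Pareto frontier instead of computing it exhaustively.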
