Contender: A Resource Modeling Approach for Concurrent Query Performance Prediction

Predicting query performance under concurrency is a difficult task that has many applications in capacity planning, cloud computing, and batch scheduling. We introduce Contender, a new resource modeling approach for predicting the concurrent query performance of analytical workloads. Contender’s unique feature is that it can generate effective predictions for both static and ad hoc (dynamic) workloads with low training requirements. These characteristics make Contender a practical solution for real-world deployment. Contender relies on models of hardware resource contention to predict concurrent query performance. It introduces two key metrics, Concurrent Query Intensity (CQI) and Query Sensitivity (QS), to characterize the impact of resource contention on query interactions. CQI models how aggressively concurrent queries will use the shared resources. QS defines how a query’s performance changes as a function of the scarcity of resources. Contender integrates these two metrics to estimate a query’s concurrent execution latency using only linear-time sampling of the query mixes. Contender learns from sample query executions (based on known query templates) and uses query plan characteristics to generate latency estimates for previously unseen templates. Our experimental results, obtained from PostgreSQL/TPC-DS, show that Contender’s predictions have an error of 19% for known templates and 25% for new templates, which is competitive with the state of the art while requiring considerably less training time.
