Supporting Time-Constrained SQL Queries in Oracle

The growth of databases, and the flexibility of the SQL query language, which allows arbitrarily complex formulations, can result in queries that take an inordinate amount of time to complete. To mitigate this problem, strategies optimized to return the 'first-few rows' or 'top-k rows' (in the case of sorted results) are usually employed. However, both strategies can lead to unpredictable query processing times. In this paper we therefore propose supporting time-constrained SQL queries. Specifically, a user issues a SQL query as before but additionally provides the nature of the constraint (soft or hard), an upper bound on query processing time, and the acceptable nature of the results (partial or approximate). The DBMS takes these criteria (constraint type, time limit, quality of result) into account when generating the query execution plan, which is expected to complete in the allotted time under a soft time constraint and guaranteed to do so under a hard one. If partial results are acceptable, the technique of reducing result set cardinality (i.e., returning the first few or top-k rows) is used; if approximate results are acceptable, sampling is used to compute query results within the specified time limit. For the latter case, we argue that trading off quality of results for predictable response time is quite useful, and we provide additional aggregate functions to estimate the aggregate values and to compute the associated confidence intervals. This paper presents the notion of time-constrained SQL queries, discusses the challenges in supporting such a construct, describes a framework for supporting such queries, and outlines its implementation in Oracle Database by exploiting Oracle's cost-based optimizer and extensibility capabilities.
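
To make the approximate-results case concrete, the following is a minimal sketch of how an aggregate such as SUM can be estimated from a uniform random sample together with a confidence interval. It is not the paper's aggregate functions or Oracle's implementation: the function name `estimate_sum`, its parameters, and the use of a normal-approximation interval with a finite population correction are all assumptions made for illustration.

```python
import math
import random
import statistics


def estimate_sum(rows, sample_fraction, z=1.96, seed=0):
    """Estimate sum(rows) from a uniform sample drawn without replacement.

    Hypothetical helper for illustration only. Returns a tuple
    (estimate, lower_bound, upper_bound), where the bounds form a
    normal-approximation confidence interval (z=1.96 is roughly 95%).
    """
    n_total = len(rows)
    if n_total < 2:
        raise ValueError("need at least two rows to estimate a variance")

    # Sample size: at least 2 rows (for a variance), at most the whole table.
    n_sample = min(n_total, max(2, int(n_total * sample_fraction)))
    sample = random.Random(seed).sample(rows, n_sample)

    mean = statistics.fmean(sample)
    stdev = statistics.stdev(sample)  # sample standard deviation

    # Finite population correction: the interval shrinks to zero width
    # when the sample covers the whole table.
    fpc = math.sqrt((n_total - n_sample) / (n_total - 1))

    estimate = n_total * mean  # scale the sample mean up to the full table
    half_width = z * n_total * stdev / math.sqrt(n_sample) * fpc
    return estimate, estimate - half_width, estimate + half_width


if __name__ == "__main__":
    values = list(range(1, 10001))
    est, low, high = estimate_sum(values, sample_fraction=0.1)
    print(f"estimated SUM = {est:.0f}, interval = [{low:.0f}, {high:.0f}]")
    print(f"exact SUM     = {sum(values)}")
```

A smaller `sample_fraction` reads fewer rows and so finishes sooner, at the cost of a wider interval; this is the quality-for-time trade-off the abstract describes, with the sample size playing the role of the knob a time-constrained planner would set.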
