Automated Performance Bug Detection in Database Systems

Because database systems are a critical component of modern data-intensive applications, it is important to ensure that they operate correctly. To this end, developers extensively test these systems to eliminate bugs that compromise functionality. Beyond functional bugs, however, there is another important class of bugs: performance bugs, which degrade a database system's response time and, in turn, the overall performance of the applications built on it. Despite their impact on end-user experience, performance bugs have received considerably less attention than functional bugs. In this paper, we present AMOEBA, a system for automatically detecting performance bugs in database systems. The core idea behind AMOEBA is to construct pairs of semantically equivalent queries and compare their response times on the same database system. If the two queries exhibit a significant difference in runtime, the root cause is likely a performance bug in the system. We propose a novel set of structure and predicate mutation rules for constructing query pairs that are likely to uncover performance bugs, and we introduce feedback mechanisms that improve the tool's efficacy and computational efficiency. We evaluate AMOEBA on two widely used DBMSs, PostgreSQL and CockroachDB. AMOEBA has discovered 20 previously unknown performance bugs, of which developers have already confirmed 14 and fixed 4.
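The abstract's core oracle can be illustrated with a minimal sketch: construct two semantically equivalent queries (here via a hypothetical predicate mutation, `b < 3` rewritten as `NOT (b >= 3)`), check that they return the same rows, and flag the pair if their runtimes diverge beyond a threshold. This is an assumed simplification, not AMOEBA's actual implementation; `compare_pair`, the threshold value, and the use of SQLite stand in for the paper's machinery.

```python
import sqlite3
import time

def timed(conn, sql, reps=5):
    # Run the query several times and keep the best wall-clock time,
    # which damps scheduling noise on a warm cache.
    best = float("inf")
    rows = None
    for _ in range(reps):
        t0 = time.perf_counter()
        rows = conn.execute(sql).fetchall()
        best = min(best, time.perf_counter() - t0)
    return rows, best

def compare_pair(conn, q1, q2, threshold=2.0):
    # Differential oracle in the spirit of AMOEBA: the two queries are
    # meant to be semantically equivalent, so a large runtime gap hints
    # at a missed optimization (a potential performance bug).
    r1, t1 = timed(conn, q1)
    r2, t2 = timed(conn, q2)
    assert sorted(r1) == sorted(r2), "query pair is not equivalent"
    ratio = max(t1, t2) / max(min(t1, t2), 1e-9)
    return ratio, ratio > threshold

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t(a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(i, i % 7) for i in range(10_000)])

# A predicate-mutation pair: `b < 3` rewritten as `NOT (b >= 3)`.
base   = "SELECT a FROM t WHERE b < 3"
mutant = "SELECT a FROM t WHERE NOT (b >= 3)"
ratio, suspicious = compare_pair(conn, base, mutant)
print(f"runtime ratio: {ratio:.2f}, suspicious: {suspicious}")
```

In practice a robust tool would repeat measurements across restarts, account for plan caching, and require the gap to persist before reporting a bug; a single timing comparison like this one is only the starting point.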
