Verification supported refactoring of embedded SQL

Improving code quality without changing its functionality, e.g., by refactoring or optimization, is an everyday programming activity. Good programming practice requires that each such change be followed by a check that it really preserves the behavior of the code. If this check is performed by testing, it can be time-consuming and still cannot guarantee the absence of behavioral differences between the two versions of the code. Hence, tools that automatically verify code equivalence would be of great help. The area we focus on is embedded SQL programming. There are a number of approaches for dealing with the equivalence of either pairs of imperative code fragments or pairs of SQL statements. However, in database-driven applications, simultaneous changes (changes that involve both the SQL and the host language code) are also present and important. Such changes can preserve the overall equivalence without preserving the equivalence of the two parts considered separately. In this paper, we propose an automated approach for dealing with the equivalence of programs after such changes, a problem that has received little attention in the literature. Our approach uses a custom first-order logic modeling of SQL queries that corresponds to imperative semantics. The approach generates equivalence conditions that can be efficiently checked using SMT solvers or first-order logic provers. We implemented the proposed approach as a framework, SQLAV, which is publicly available and open source.
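To make the notion of a simultaneous change concrete, the following minimal sketch shows two versions of a small routine in which both the SQL statement and the host code differ, yet the overall result is the same. It is an illustration only and is not taken from SQLAV: the `orders` table, its columns, and the function names are hypothetical, Python with the standard `sqlite3` module stands in for an embedded SQL host language, and the comparison is done by running both versions on sample data, which is exactly the kind of test-based check that cannot prove equivalence in general.

```python
import sqlite3


def make_db(rows):
    """Create an in-memory database with a hypothetical `orders` table."""
    conn = sqlite3.connect(":memory:")
    # NOT NULL keeps the example simple: NULL amounts would need extra care
    # in the host-language comparison of version 1.
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO orders (id, amount) VALUES (?, ?)", rows)
    return conn


def total_large_v1(conn, threshold):
    """Version 1: SQL fetches every amount; host code filters and sums."""
    total = 0
    for (amount,) in conn.execute("SELECT amount FROM orders"):
        if amount > threshold:
            total += amount
    return total


def total_large_v2(conn, threshold):
    """Version 2: filtering and summation are pushed into the SQL query."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE amount > ?",
        (threshold,),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = make_db([(1, 5), (2, 20), (3, 30)])
    print(total_large_v1(conn, 10), total_large_v2(conn, 10))
```

Here the two SQL statements are not equivalent to each other, and neither are the two host-code fragments, but the functions as a whole agree. Establishing that agreement for every database state and every threshold, rather than for a handful of sample inputs, is the kind of question the equivalence conditions described above are meant to answer.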
