A Static Code Smell Detector for SQL Queries Embedded in Java Code

A database plays a central role in the architecture of an information system, and the way it stores the data delimits its main features. However, it is not just the data that matters. The way it is handled, i.e., how the application communicates with the database is of critical importance too. Therefore the implementation of such a communication layer has to be reliable and efficient. SQL is a popular language to query a database, and modern technologies rely on it (or its dialects) as query strings embedded in the application code. In many languages (e.g. in Java), an embedded query is typically constructed through several string operations that obstruct developers in understanding the statement finally sent to the database. It is a potential source of fault-prone and inefficient database usage, i.e., code smells. In our paper, we present a tool for the identification of code smells in SQL queries embedded in Java code. Our tool implements a combined static analysis of the SQL statements embedded in the source code, the database schema, and the data in the database. We use a lightweight query extraction algorithm to extract SQL code from the Java code and implement smell detectors on the ASG of our fault-tolerant SQL parser. Depending on the context of the smell, its severity is also determined. Developers can examine the identified issues with the help of an Eclipse plug-in or through command line interfaces.

[1]  B. B. Meshram,et al.  Analysis of different technique for detection of SQL injection , 2011, ICWET.

[2]  Premkumar T. Devanbu,et al.  Static checking of dynamically generated queries in database applications , 2004, Proceedings. 26th International Conference on Software Engineering.

[3]  Aske Simon Christensen,et al.  Precise Analysis of String Expressions , 2003, SAS.

[4]  Premkumar T. Devanbu,et al.  JDBC checker: a static analysis tool for SQL/JDBC applications , 2004, Proceedings. 26th International Conference on Software Engineering.

[5]  Charles Welty,et al.  Correcting User Errors in SQL , 1985, Int. J. Man Mach. Stud..

[6]  Tom Mens,et al.  CodeCritics applied to database schema: Challenges and first results , 2017, 2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER).

[7]  Anthony Cleve,et al.  Mining Stack Overflow for discovering error patterns in SQL queries , 2015, 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME).

[8]  C. Goldberg DO YOU KNOW SQL ? ABOUT SEMANTIC ERRORS IN DATABASE QUERIES , 2009 .

[9]  Bill Karwin,et al.  SQL Antipatterns: Avoiding the Pitfalls of Database Programming , 2010 .

[10]  Raymond Lister,et al.  Students' Semantic Mistakes in Writing Seven Different Types of SQL Queries , 2016, ITiCSE.

[11]  Shigeichi Hirasawa,et al.  On automatic detection of SQL injection attacks by the feature extraction of the single character , 2011, SIN '11.

[12]  Varmo Vene,et al.  An Interactive Tool for Analyzing Embedded SQL Queries , 2010, APLAS.

[13]  Gregory M. Kapfhammer,et al.  Localizing SQL faults in database applications , 2011, 2011 26th IEEE/ACM International Conference on Automated Software Engineering (ASE 2011).

[14]  Tom Mens,et al.  Towards a survival analysis of database framework usage in Java projects , 2015, 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME).

[15]  Raymond Lister,et al.  A Quantitative Study of the Relative Difficulty for Novices of Writing Seven Different Types of SQL Queries , 2015, ITiCSE.

[16]  Joost Visser,et al.  Quality Assessment for Embedded SQL , 2007 .

[17]  Anthony Cleve,et al.  Static Analysis of Dynamic Database Usage in Java Systems , 2016, CAiSE.

[18]  Stefan Brass,et al.  Semantic errors in SQL queries: A quite complete list , 2006, J. Syst. Softw..

[19]  Stefan Brass,et al.  Detecting Logical Errors in SQL Queries , 2004, Grundlagen von Datenbanken.

[20]  Anthony Cleve,et al.  Where was this SQL query executed? a static concept location approach , 2015, 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER).

[21]  A. Faye Borthick,et al.  Improving New Users’ Query Performance: Deterring Premature Stopping of Query Revision with Information for Forming Ex Ante Expectations , 2012, JDIQ.

[22]  David Anderson,et al.  Query Construction Patterns in PHP , 2017, 2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER).