The relationship between user query accuracy and lines of code

In experimental studies of query languages, subjects are asked to write queries in different query languages, and their performance is usually measured by query accuracy. However, there is no clearly defined, objective method for applying the findings to queries other than those tested. This study examines whether a software metric based on lines of code is suitable for estimating user query accuracy. Lines of code have been measured in various ways, such as physical source lines, logical source lines, or compiled bytes. A method of counting lines of code for database queries is proposed and applied to two query languages. The new method counts Boolean conditions as well as other statements. The relationship between lines of code and user query accuracy was examined with regression models. The results show that lines of code explain a high percentage of the variance in accuracy, with R² > 0.8 for SQL, the standard query language for the relational model, and R² > 0.9 for KQL, a query language for the entity-relationship model. The common assumption that more lines of code lead to lower accuracy is only partly validated: the findings show a nonlinear relationship, with a possible recovery in accuracy for queries with many lines of code. The results indicate that lines of code can be usefully applied in the study of query languages.
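To make the kind of analysis described above concrete, the following is a minimal sketch of a nonlinear regression of accuracy on lines of code, with R² computed as the share of variance explained. The quadratic form, the function names `fit_quadratic` and `r_squared`, and the data points are all assumptions made for illustration. The abstract does not state which regression models were used, and the numbers below are synthetic, not results from the study.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x^2 (assumed model form)."""
    # Build the augmented 3x4 normal-equation system (X^T X) b = X^T y.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    a = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def r_squared(xs, ys, coeffs):
    """Proportion of variance in ys explained by the fitted quadratic."""
    b0, b1, b2 = coeffs
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (b0 + b1 * x + b2 * x * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Synthetic illustration only: lines of code per query and mean accuracy.
    loc = [2, 4, 6, 8, 10, 12, 14]
    accuracy = [0.95, 0.80, 0.62, 0.50, 0.45, 0.48, 0.55]
    coeffs = fit_quadratic(loc, accuracy)
    print("coefficients (b0, b1, b2):", coeffs)
    print("R^2:", r_squared(loc, accuracy, coeffs))
```

In a fit of this form, a positive coefficient on the squared term corresponds to accuracy falling and then recovering as lines of code increase, which is one way a nonlinear relationship of the kind reported could appear.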
