Students' Semantic Mistakes in Writing Seven Different Types of SQL Queries

Computer science researchers have studied extensively the mistakes of novice programmers. In comparison, little attention has been given to studying the mistakes of people who are novices at writing database queries. This paper represents the first large scale analysis of students' semantic mistakes in writing different types of SQL SELECT statements. Over 160 thousand snapshots of SQL queries were collected from over 2300 students across nine years. We describe the most common semantic mistakes that these students made when writing different types of SQL statements, and suggest reasons behind those mistakes. We mapped the semantic mistakes we identified in our data to different semantic categories found in the literature. Our findings show that the majority of semantic mistakes are of the type "omission". Most of these omissions happen in queries that require a JOIN, a subquery, or a GROUP BY operator. We conclude that it is important to explicitly teach students techniques for choosing the appropriate type of query when designing a SQL query.

[1]  Xin Zhou,et al.  An open integrated exploratorium for database courses , 2008, ITiCSE.

[2]  John B. Smelcer,et al.  User errors in database query composition , 1995, Int. J. Hum. Comput. Stud..

[3]  G. A. Miller THE PSYCHOLOGICAL REVIEW THE MAGICAL NUMBER SEVEN, PLUS OR MINUS TWO: SOME LIMITS ON OUR CAPACITY FOR PROCESSING INFORMATION 1 , 1956 .

[4]  Julia Prior,et al.  AsseSQL: an online, browser-based SQL skills assessment tool , 2014, ITiCSE '14.

[5]  Antonija Mitrovic,et al.  Learning SQL with a computerized tutor , 1998, SIGCSE '98.

[6]  Phyllis Reisner,et al.  Use of Psychological Experimentation as an Aid to Development of a Query Language , 1977, IEEE Transactions on Software Engineering.

[7]  Gerald V. Post Database Management Systems-Designing & Building Business Applications , 1998 .

[8]  Raymond Lister,et al.  A Quantitative Study of the Relative Difficulty for Novices of Writing Seven Different Types of SQL Queries , 2015, ITiCSE.

[9]  Vladimir Zadorozhny,et al.  Learning SQL Programming with Interactive Tools: From Integration to Personalization , 2010, TOCE.

[10]  Antonija Mitrovic,et al.  An Intelligent SQL Tutor on the Web , 2003, Int. J. Artif. Intell. Educ..

[11]  Stefan Brass,et al.  Semantic errors in SQL queries: A quite complete list , 2006, J. Syst. Softw..

[12]  Charles Welty,et al.  Correcting User Errors in SQL , 1985, Int. J. Man Mach. Stud..

[13]  Stefan Brass,et al.  Semantic errors in SQL queries: a quite complete list , 2004, Fourth International Conference onQuality Software, 2004. QSIC 2004. Proceedings..

[14]  Raymond Lister,et al.  Students' Syntactic Mistakes in Writing Seven Different Types of SQL Queries and its Application to Predicting Students' Success , 2016, SIGCSE.

[15]  David W. Stemple,et al.  Human factors comparison of a procedural and a nonprocedural query language , 1981, TODS.

[16]  R. B. Buitendijk Logical errors in database SQL retrieval queries , 1988 .

[17]  Raymond Lister,et al.  The backwash effect on SQL skills grading , 2004, ITiCSE '04.