Insights from Student Solutions to SQL Homework Problems

We analyze the submissions of 286 students as they solved Structured Query Language (SQL) homework assignments for an upper-level databases course. Databases and the ability to query them are increasingly essential not only for computer scientists but also for business professionals, scientists, and anyone who needs to make data-driven decisions. Despite the growing importance of SQL and databases, little research has documented student difficulties in learning SQL. We replicate and extend prior studies of these difficulties. Students worked on and submitted their homework through an online learning management system that supports autograding of code. Students received immediate feedback on the correctness of their solutions and had approximately a week to finish writing eight to ten queries. We categorized each submission by the type of error it contained, if any, and by whether the student was eventually able to construct a correct query. Like prior work, we find that the majority of student mistakes are syntax errors. In contrast with the conclusions of prior work, we find that some students are never able to resolve these syntax errors to create valid queries. Additionally, we find that students struggle the most when writing SQL queries that involve GROUP BY and correlated subqueries. We suggest implications for instruction and future research.
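For readers unfamiliar with the two constructs named above, the following is a minimal sketch of a GROUP BY query, a correlated subquery, and a query that fails with a syntax error. The `enrollments` table, its columns, and its rows are hypothetical and invented for illustration; they are not taken from the course's assignments, and SQLite stands in for whatever database system the course used.

```python
import sqlite3

# Hypothetical schema and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE enrollments (student TEXT, course TEXT, score INTEGER);
    INSERT INTO enrollments VALUES
        ('ana', 'db', 90), ('ben', 'db', 70),
        ('ana', 'os', 60), ('cal', 'os', 80);
""")

# GROUP BY: one output row per course, with an aggregate over its rows.
group_by_rows = conn.execute("""
    SELECT course, AVG(score)
    FROM enrollments
    GROUP BY course
    ORDER BY course
""").fetchall()

# Correlated subquery: the inner query refers to the outer row (e.course),
# so it is conceptually re-evaluated for each outer row.
top_scorers = conn.execute("""
    SELECT e.student, e.course
    FROM enrollments AS e
    WHERE e.score = (
        SELECT MAX(i.score)
        FROM enrollments AS i
        WHERE i.course = e.course
    )
    ORDER BY e.course
""").fetchall()

# A malformed GROUP BY clause (the BY keyword is missing), which SQLite
# rejects at parse time; this is one example of a syntax error.
try:
    conn.execute("SELECT course, AVG(score) FROM enrollments GROUP course")
    syntax_error = None
except sqlite3.OperationalError as exc:
    syntax_error = str(exc)

print(group_by_rows)
print(top_scorers)
print(syntax_error)
```

The third query never runs: the database reports the problem before evaluating anything, which is the kind of immediate feedback an autograder can pass back to a student.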
