Using Keystroke Analytics to Improve Pass-Fail Classifiers

Learning analytics offers insights into student behaviour and the potential to detect poor performers before they fail exams. If the activity is primarily online (for example, computer programming), a wealth of low-level data becomes available, allowing unprecedented accuracy in predicting which students will pass or fail. In this paper, we present a classification system for early detection of poor performers based on student effort data, such as the complexity of the programs they write, and show how it can be improved by the use of low-level keystroke analytics.
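
To illustrate what a low-level keystroke feature might look like, the sketch below computes mean digraph latencies (the time between consecutive key presses) and combines them with an effort measure into one feature record per student. This is not the system described in the paper: the event format, the function names, the `max_gap_ms` pause threshold, and the `program_complexity` input are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def digraph_latencies(events, max_gap_ms=2000):
    """Mean latency in ms for each consecutive key pair (digraph).

    `events` is a hypothetical list of (timestamp_ms, key) tuples sorted
    by time. Gaps longer than `max_gap_ms` are treated as pauses and
    skipped; that threshold is an assumption, not a value from the paper.
    """
    gaps = defaultdict(list)
    for (t1, k1), (t2, k2) in zip(events, events[1:]):
        gap = t2 - t1
        if 0 <= gap <= max_gap_ms:
            gaps[(k1, k2)].append(gap)
    return {pair: mean(values) for pair, values in gaps.items()}


def student_features(events, program_complexity):
    """Combine an effort measure with a keystroke measure.

    `program_complexity` stands in for whatever effort metric is
    available; how it is computed is outside this sketch.
    """
    latencies = digraph_latencies(events)
    overall = mean(latencies.values()) if latencies else 0.0
    return {
        "complexity": program_complexity,
        "mean_digraph_latency_ms": overall,
    }


if __name__ == "__main__":
    sample = [(0, "i"), (100, "f"), (250, "("), (5000, "i"), (5120, "f")]
    print(student_features(sample, program_complexity=3))
```

A record like this could be passed to any standard pass-fail classifier as one row of input. Whether such a feature improves prediction is an empirical question that the paper addresses; the sketch only shows how the feature could be derived.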
