How Flexible Is Your Data? A Comparative Analysis of Scoring Methodologies across Learning Platforms in the Context of Group Differentiation

Data is flexible: it is shaped not only by the features and variables available to a researcher for analysis and interpretation, but also by how those features and variables are recorded and processed prior to evaluation. “Big Data” from online learning platforms and intelligent tutoring systems is no different. The work presented herein questions the quality and flexibility of data from two popular learning platforms by comparing binary measures of problem-level accuracy, the scoring method typically used to inform learner analytics, with partial credit scoring, a more robust, real-world methodology. This work extends previous research by examining how manipulating the scoring methodology can alter the outcomes of hypothesis tests, specifically tests for significant differences between groups of students. Datasets from ASSISTments and Cognitive Tutor are used to assess the implications of data availability and manipulation within twelve mathematics skills. A resampling approach is used to determine the size of equivalent samples of high- and low-performing students required to reliably differentiate performance under each scoring methodology. Results suggest that in eleven of the twelve observed skills, partial credit offers more efficient group differentiation, increasing analytic power and reducing Type II error. Alternative applications of this approach and implications for the Learning Analytics community are discussed.
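
The resampling idea described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names, the normal-approximation test, sampling with replacement, the 0.05 significance level, and the 90% detection target are all assumptions chosen for illustration, and scores are assumed to arrive already scaled to [0, 1], with binary accuracy granted only for full credit.

```python
import random
from statistics import NormalDist, mean, variance


def binary_score(partial):
    """Collapse a partial credit score in [0, 1] to binary accuracy.

    Assumption: only full credit counts as correct."""
    return 1.0 if partial >= 1.0 else 0.0


def z_test_p(a, b):
    """Two-sided p-value for a difference in means (normal approximation)."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    if se == 0:
        # No spread in either sample: any difference in means is exact.
        return 0.0 if mean(a) != mean(b) else 1.0
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def detection_rate(high, low, n, trials, rng, alpha=0.05):
    """Fraction of resampled trials in which the two groups differ at alpha."""
    hits = 0
    for _ in range(trials):
        a = rng.choices(high, k=n)  # equal-sized samples, drawn with replacement
        b = rng.choices(low, k=n)
        if z_test_p(a, b) < alpha:
            hits += 1
    return hits / trials


def min_sample_size(high, low, sizes, trials=200, target=0.9, seed=0):
    """Smallest n in `sizes` (each >= 2) whose detection rate reaches `target`.

    Returns None if no candidate size reaches the target."""
    rng = random.Random(seed)
    for n in sizes:
        if detection_rate(high, low, n, trials, rng) >= target:
            return n
    return None


if __name__ == "__main__":
    # Synthetic scores for illustration only; not data from either platform.
    rng = random.Random(1)
    levels = [0.0, 0.3, 0.7, 1.0]
    high = rng.choices(levels, weights=[1, 2, 3, 4], k=500)
    low = rng.choices(levels, weights=[3, 3, 2, 2], k=500)
    sizes = range(5, 205, 5)
    print("partial:", min_sample_size(high, low, sizes))
    print("binary: ", min_sample_size([binary_score(s) for s in high],
                                      [binary_score(s) for s in low], sizes))
```

Running the same search once on the partial credit scores and once on their binary collapse gives two required sample sizes that can be compared directly; a smaller size under one scoring method would indicate more efficient group differentiation for that method on the data supplied.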
