Population validity for educational data mining models: A case study in affect detection

Information and communication technology (ICT)-enhanced research methods such as educational data mining (EDM) have allowed researchers to effectively model a broad range of constructs pertaining to the student, moving from traditional assessments of knowledge to assessment of engagement, meta-cognition, strategy and affect. The automated detection of these constructs allows EDM researchers to develop intervention strategies that can be implemented either by the software or the teacher. It also allows for secondary analyses of the construct, where the detectors are applied to a data set that is much larger than one that could be analyzed by more traditional methods. However, in many cases, the data used to develop EDM models are collected from students who may not be representative of the broader populations who are likely to use ICT. In order to use EDM models (automated detectors) with new populations, their generalizability must be verified. In this study, we examine whether detectors of affect remain valid when applied to new populations. Models of four educationally relevant affective states were constructed based on data from urban, suburban and rural students using ASSISTments software for middle school mathematics in the Northeastern United States. We found that affect detectors trained on a population drawn primarily from one demographic grouping do not generalize to populations drawn primarily from the other demographic groupings, even though those populations might be considered part of the same national or regional culture. Models constructed using data from all three subpopulations are more applicable to students in those populations than those trained on a single group, but still do not achieve ideal population validity, that is, the ability to generalize across all subgroups. In particular, models generalize better across urban and suburban students than to rural students. These findings have important implications for data collection efforts, validation techniques, and the design of interventions that are intended to be applied at scale.
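The population-validity check described above (train a detector on one group, then test it on each group) can be illustrated with a minimal sketch. It is not the study's actual pipeline: the single-feature threshold detector, the function names, and the toy values are all hypothetical, and Cohen's kappa is used here only as one plausible chance-corrected agreement metric.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Chance agreement from the marginal label frequencies of both sequences.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        return 0.0  # degenerate case: both sequences contain a single label
    return (observed - expected) / (1 - expected)


def train_threshold_detector(samples):
    """Hypothetical stand-in detector: threshold one feature at the
    midpoint between the two class means."""
    pos = [x for x, y in samples if y == 1]
    neg = [x for x, y in samples if y == 0]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x >= threshold else 0


def cross_population_kappa(data_by_group, train_fn):
    """Train on each group and score the resulting detector on every group."""
    results = {}
    for train_group, train_samples in data_by_group.items():
        detect = train_fn(train_samples)
        for test_group, test_samples in data_by_group.items():
            y_true = [y for _, y in test_samples]
            y_pred = [detect(x) for x, _ in test_samples]
            results[(train_group, test_group)] = cohen_kappa(y_true, y_pred)
    return results


# Toy (feature, label) pairs, invented purely for illustration.
toy_data = {
    "urban": [(1, 0), (2, 0), (3, 0), (7, 1), (8, 1), (9, 1)],
    "rural": [(6, 0), (7, 0), (8, 0), (12, 1), (13, 1), (14, 1)],
}

results = cross_population_kappa(toy_data, train_threshold_detector)
for (train_group, test_group), kappa in sorted(results.items()):
    print(f"train={train_group:5s} test={test_group:5s} kappa={kappa:.2f}")
```

In this toy setup each detector agrees perfectly with the labels of its own group and falls to chance-level agreement on the other group, because the feature is distributed differently in the two groups. The within-group scores are computed on the training data and are therefore optimistic; a real evaluation would hold out students, for example through student-level cross-validation.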
