Improving the expressiveness of black-box models for predicting student performance

Systems for the early prediction of student performance can be very useful for guiding student learning. For a prediction model to be an effective learning aid, it must provide tools to interpret progress adequately, to detect trends and behaviour patterns, and to identify the causes of learning problems. Both white-box and black-box techniques have been described in the literature for implementing prediction models. White-box techniques require an a priori model to explore, which makes them easy to interpret but difficult to generalize and unable to detect unexpected relationships in the data. Black-box techniques are easier to generalize and well suited to discovering unsuspected relationships, but they are cryptic and difficult for most teachers to interpret. In this paper, a black-box technique is proposed that takes advantage of the power and versatility of these methods, together with decisions about the input data and the design of the classifier that yield a rich output data set. A set of graphical tools is also proposed to exploit the output information and provide meaningful guidance to teachers and students. Drawing on our experience, we also provide a set of tips on how to design a prediction system and how to represent its output information.

Black-box classifiers are proposed to predict student performance. Black-box classifiers are powerful and generalizable but difficult to interpret. Some tips about their design are proposed to improve their expressiveness. Some graphical tools are proposed to exploit the expressiveness and help students.
