Predicting Secondary School Students' Performance Utilizing a Semi-supervised Learning Approach

Educational data mining is a recent research field that has gained popularity over the last decade because of its ability to monitor students' academic performance and predict their future progression. Numerous machine learning techniques, especially supervised learning algorithms, have been applied to develop accurate models that predict the students' characteristics which drive their behavior and performance. In this work, we examine and evaluate the effectiveness of two wrapper methods for semi-supervised learning in predicting students' performance in the final examinations. Our preliminary numerical experiments indicate that semi-supervised methods can significantly improve classification accuracy by using a few labeled and many unlabeled examples to build reliable prediction models.
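To illustrate what a wrapper method for semi-supervised learning does with a few labeled and many unlabeled examples, the following is a minimal sketch of self-training, one common wrapper scheme. The abstract does not name the two wrappers evaluated, so this is an assumption and not the authors' implementation. The nearest-centroid base learner, the confidence threshold, the function names, and the synthetic two-grade data (0 to 20 scale, "pass"/"fail" labels) are all hypothetical choices made for illustration.

```python
import math
import random

def centroids(X, y):
    """Fit the base learner: one mean feature vector per class."""
    sums, counts = {}, {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
        counts[yi] = counts.get(yi, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def predict_with_confidence(model, x):
    """Return (label, confidence); confidence is a softmax over negative distances."""
    weights = {c: math.exp(-math.dist(x, mu)) for c, mu in model.items()}
    label = max(weights, key=weights.get)
    return label, weights[label] / sum(weights.values())

def self_train(X_lab, y_lab, X_unlab, threshold=0.8, max_iter=10):
    """Wrapper loop: retrain, pseudo-label confident unlabeled points, repeat."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_iter):
        model = centroids(X_lab, y_lab)
        remaining, added = [], 0
        for x in pool:
            label, conf = predict_with_confidence(model, x)
            if conf >= threshold:
                # Treat the confident prediction as if it were a true label.
                X_lab.append(x)
                y_lab.append(label)
                added += 1
            else:
                remaining.append(x)
        pool = remaining
        if added == 0 or not pool:
            break
    return centroids(X_lab, y_lab)

# Synthetic data (hypothetical): two earlier grades per student on a 0-20 scale.
rng = random.Random(0)

def sample(center, n):
    return [[rng.gauss(center, 1.5), rng.gauss(center, 1.5)] for _ in range(n)]

X_lab = sample(14, 2) + sample(7, 2)            # only four labeled students
y_lab = ["pass", "pass", "fail", "fail"]
X_unlab = sample(14, 100) + sample(7, 100)      # many unlabeled students
X_test = sample(14, 50) + sample(7, 50)
y_test = ["pass"] * 50 + ["fail"] * 50

model = self_train(X_lab, y_lab, X_unlab)
hits = sum(predict_with_confidence(model, x)[0] == y for x, y in zip(X_test, y_test))
accuracy = hits / len(y_test)
print(f"test accuracy: {accuracy:.2f}")
```

The wrapper is independent of the base learner: any classifier that reports a confidence for its predictions could replace the nearest-centroid model here. The threshold controls the trade-off between adding more pseudo-labeled examples and the risk of reinforcing early mistakes.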
