Multi-source and multimodal data fusion for predicting academic performance in blended learning university courses

Abstract In this paper we apply data fusion approaches to predict the final academic performance of university students using multi-source, multimodal data from blended learning environments. We collect and preprocess data about first-year university students from different sources: theory classes, practical sessions, online Moodle sessions, and a final exam. Our objective is to discover which data fusion approach produces the best results on our data. We carry out experiments with four different data fusion approaches and six classification algorithms. The results show that the best predictions are produced by the approach that uses ensembles and selection of the best attributes with discretized data. The best prediction models indicate that the level of attention in theory classes, scores in Moodle quizzes, and the level of activity in Moodle forums form the best set of attributes for predicting students’ final performance in our courses.
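To make the idea of an ensemble over discretized, per-source data more concrete, the following is a minimal sketch and not the models used in the paper. It assumes three hypothetical sources (`theory`, `moodle`, `practical`), scores normalised to [0, 1], assumed discretization thresholds, a simple lookup-table classifier per source, and majority voting as the fusion step. All names and the toy data are illustrative assumptions.

```python
from collections import Counter

def discretize(value, low=0.33, high=0.66):
    """Map a score in [0, 1] to a category (thresholds are assumed, not from the paper)."""
    if value < low:
        return "LOW"
    if value < high:
        return "MEDIUM"
    return "HIGH"

def discretize_row(values):
    """Discretize every attribute of one source for one student."""
    return tuple(discretize(v) for v in values)

def train_source_model(rows, labels):
    """Toy per-source classifier: remember the majority label for each attribute tuple."""
    counts = {}
    for row, label in zip(rows, labels):
        counts.setdefault(row, Counter())[label] += 1
    default = Counter(labels).most_common(1)[0][0]  # fallback for unseen tuples
    lookup = {row: c.most_common(1)[0][0] for row, c in counts.items()}
    return lambda row: lookup.get(row, default)

def fit_ensemble(students, labels):
    """Train one model per data source on its discretized attributes."""
    sources = students[0].keys()
    return {
        src: train_source_model([discretize_row(s[src]) for s in students], labels)
        for src in sources
    }

def predict(models, student):
    """Fuse the per-source predictions by majority vote."""
    votes = [model(discretize_row(student[src])) for src, model in models.items()]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical toy data: attention in theory classes, (quiz score, forum activity)
# in Moodle, and attendance at practical sessions.
students = [
    {"theory": (0.9,), "moodle": (0.8, 0.7), "practical": (0.9,)},
    {"theory": (0.2,), "moodle": (0.3, 0.1), "practical": (0.4,)},
    {"theory": (0.7,), "moodle": (0.9, 0.5), "practical": (0.2,)},
    {"theory": (0.1,), "moodle": (0.2, 0.2), "practical": (0.5,)},
    {"theory": (0.8,), "moodle": (0.7, 0.9), "practical": (0.7,)},
]
labels = ["PASS", "FAIL", "PASS", "FAIL", "PASS"]

models = fit_ensemble(students, labels)
new_student = {"theory": (0.85,), "moodle": (0.75, 0.8), "practical": (0.5,)}
print(predict(models, new_student))
```

In this sketch each source votes independently, so a weak signal from one source (here, practical sessions) can be outvoted by the others. A real experiment would replace the lookup table with proper classification algorithms and evaluate on held-out data.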
