Prediction of Coding Intricacy in a Software Engineering Team through Machine Learning to Ensure Cooperative Learning and Sustainable Education

Coding deliverables are a vital part of a software project. Teams are formed to develop a software project over a term, and a team's performance at each milestone determines the success or failure of the project. Coding intricacy is a major issue for students, as coding is widely regarded as a complex field that demands skill and practice. Future education demands a smart environment for understanding students. Predicting the coding intricacy level in teams can help cultivate a cooperative educational environment for sustainable education. This study proposes a boosting-based approach built on the random forest (RF) algorithm of machine learning (ML) for predicting the coding intricacy level among software engineering teams. The performance of the proposed approach is compared with that of other viable ML algorithms. The results show that boosting the RF algorithm predicts coding intricacy better than bagging, J48, sequential minimal optimization (SMO), multilayer perceptron (MLP), and Naive Bayes (NB). Logistic regression-based boosting (LogitBoost) and adaptive boosting (AdaBoost) perform best, with a prediction accuracy of 85.14%. The concerns that lead to a high coding intricacy level can be resolved through discussion with peers and instructors. The proposed approach can encourage a responsible attitude among software engineering teams and help fulfill the goals of education for sustainable development by optimizing the learning environment.
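
To make the boosting idea concrete, the following is a minimal sketch of adaptive boosting (AdaBoost) for a binary label. It is not the authors' pipeline. The study boosts a random forest, whereas this sketch swaps in one-feature decision stumps as the weak learner so that it runs with the Python standard library alone. The toy data, the single feature (a hypothetical team score from 0 to 9), and the encoding of +1 as high and -1 as low coding intricacy are all assumptions made for illustration.

```python
import math

def stump_predict(row, feature, threshold, polarity):
    # A decision stump: one feature, one threshold, labels in {+1, -1}.
    return polarity if row[feature] <= threshold else -polarity

def fit_stump(X, y, weights):
    # Exhaustively pick the stump with the lowest weighted training error.
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                err = sum(w for row, label, w in zip(X, y, weights)
                          if stump_predict(row, feature, threshold, polarity) != label)
                if best is None or err < best[0]:
                    best = (err, feature, threshold, polarity)
    return best

def adaboost_fit(X, y, rounds):
    n = len(X)
    weights = [1.0 / n] * n          # start with uniform sample weights
    model = []
    for _ in range(rounds):
        err, feature, threshold, polarity = fit_stump(X, y, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)      # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)    # vote strength of this stump
        model.append((alpha, feature, threshold, polarity))
        # Raise the weight of misclassified samples, lower the rest, renormalize.
        weights = [w * math.exp(-alpha * label *
                                stump_predict(row, feature, threshold, polarity))
                   for row, label, w in zip(X, y, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model

def adaboost_predict(model, row):
    # Weighted vote of all stumps.
    score = sum(alpha * stump_predict(row, f, t, p) for alpha, f, t, p in model)
    return 1 if score >= 0 else -1

def accuracy(model, X, y):
    hits = sum(adaboost_predict(model, row) == label for row, label in zip(X, y))
    return hits / len(y)

# Hypothetical toy data: one feature per team, label +1 = high intricacy.
X = [[v] for v in range(10)]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

print("1 round :", accuracy(adaboost_fit(X, y, rounds=1), X, y))
print("3 rounds:", accuracy(adaboost_fit(X, y, rounds=3), X, y))
```

On this toy set no single stump can separate the labels, so one round misclassifies three of the ten teams, while three reweighted rounds fit the training data exactly. These are training accuracies on invented data and say nothing about the 85.14% figure reported in the study.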
