Educational Data Mining: Identification of factors associated with school effectiveness in PISA assessment

Abstract This work presents an innovative methodological proposal for identifying the process factors associated with school effectiveness in secondary education. Using secondary data from the Spanish sample of PISA 2015, high- and low-effectiveness schools were selected by analysing the school-level (level 2) residuals of multilevel models. Decision trees were then used to identify the process variables with the greatest predictive power for distinguishing high- from low-effectiveness schools. The hierarchical linear models obtained show inter-school variance above 10%, and the decision trees achieve a precision greater than 90%. We conclude by analysing the suitability of decision trees for data from large-scale assessments and by examining the factors found to be associated with school effectiveness.
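
The two-stage procedure summarised in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and makes several simplifying assumptions: the data are synthetic, a single context covariate (here called `escs`) stands in for the full set of context variables, the school-level residual is approximated by a shrunken mean of student residuals rather than estimated from a fitted multilevel model, schools are selected with an arbitrary quartile cut-off, and the decision tree is reduced to a single Gini-based split. All function names, variable names (`climate`, `noise`), and constants are hypothetical.

```python
from collections import defaultdict

def school_residuals(students, shrink=5.0):
    """Approximate level-2 residuals from (school_id, escs, score) tuples.

    Scores are regressed on escs by ordinary least squares; each school's
    mean student residual is then shrunk toward zero by n / (n + shrink),
    a rough stand-in for the residuals of a random-intercept model.
    """
    n = len(students)
    mx = sum(s[1] for s in students) / n
    my = sum(s[2] for s in students) / n
    sxx = sum((s[1] - mx) ** 2 for s in students)
    sxy = sum((s[1] - mx) * (s[2] - my) for s in students)
    slope = sxy / sxx
    by_school = defaultdict(list)
    for sid, x, y in students:
        by_school[sid].append(y - (my + slope * (x - mx)))
    return {sid: sum(r) / (len(r) + shrink) for sid, r in by_school.items()}

def select_extremes(residuals, fraction=0.25):
    """Label the lowest-residual schools 0 and the highest-residual schools 1."""
    ranked = sorted(residuals, key=residuals.get)
    k = max(1, int(len(ranked) * fraction))
    labels = {sid: 0 for sid in ranked[:k]}
    labels.update({sid: 1 for sid in ranked[-k:]})
    return labels

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(rows, labels):
    """Return (feature, threshold, weighted Gini) of the best single split."""
    best = (None, None, gini(labels))
    for feat in rows[0]:
        values = sorted({r[feat] for r in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[feat] <= thr]
            right = [y for r, y in zip(rows, labels) if r[feat] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best[2]:
                best = (feat, thr, score)
    return best

# Synthetic data (assumption): 20 schools, 10 students each. The process
# variable "climate" drives the school effect; "noise" is unrelated to it.
process = {j: {"climate": j / 19, "noise": ((j * 7) % 20) / 19} for j in range(20)}
students = []
for j in range(20):
    effect = 40 * (process[j]["climate"] - 0.5)
    for i in range(10):
        escs = (i - 4.5) / 3
        wobble = (3 * i + j) % 5 - 2  # deterministic student-level variation
        students.append((j, escs, 500 + 30 * escs + effect + wobble))

# Stage 1: school residuals and selection of extreme schools.
labels = select_extremes(school_residuals(students))

# Stage 2: find the process variable that best separates the two groups.
selected = sorted(labels)
rows = [process[sid] for sid in selected]
target = [labels[sid] for sid in selected]
feature, threshold, impurity = best_split(rows, target)
print(feature, round(threshold, 3), impurity)
```

In a real analysis the first stage would come from a fitted multilevel model and the second from a full decision tree learner with cross-validation; the sketch only shows how the residual-based selection feeds the classification step.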
