Writing Proficiency Assessment: Regression Analysis of Item Response Theory supported by Machine Learning Techniques

A person's ability to express themselves reflects their ability to understand reality, and text production is one way to verify proficiency in that skill. This relationship can support the teaching-learning process, since diagnosing learning depends on identifying possible instructional gaps, which in turn informs the design of better teaching strategies. In this article, we present an approach to characterizing learning profiles and estimating grades in the assessment of writing tests. To that end, we applied item response theory and machine learning techniques to the dataset of test scores from the Exame Nacional do Ensino Médio carried out in 2019. The results show that, using only 2k training instances out of the 3.7M available and only one of the five competencies evaluated, it is possible to predict the skill with a p-value of 0.06 and a Pearson correlation of 0.94. Our approach shows the benefits of employing such techniques in a real-world scenario.