Analyzing Students’ Knowledge Building Skills by Comparing Their Written Production to Syllabus

The field of Learning Analytics (LA) focuses on the collection, measurement, and analysis of data about learners and their contexts. LA benefits from tools that are usually rooted in probabilistic or frequency-based approaches; these approaches cannot capture the meaning of texts at any level, because probabilities do not constitute a natural language semantics. As an alternative, Natural Language Processing (NLP) techniques allow semantic aspects to be integrated into the analysis. In this study, we evaluate how well the syllabus vocabulary is covered in students’ documents, using a method based on linguistic and cognitive knowledge. Our analysis relies on an asymmetric hybrid coverage measure, which combines semantic and lexical information with cognitive principles to determine how syllabus concepts are covered in students’ documents. To determine whether the concepts of a book are covered by the paragraphs in the students’ documents, we implemented a paragraph-to-document alignment strategy. This approach distinguishes between stronger and weaker productions by measuring the degree of concept coverage between students’ papers and multiple sections of the syllabus.
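
To make the notion of asymmetric coverage concrete, the following is a minimal sketch and not the hybrid measure used in this study: it is purely lexical, with no semantic or cognitive component. The function names (`tokenize`, `coverage`, `best_paragraph`) and the simple word-set representation are assumptions introduced for illustration only.

```python
import re


def tokenize(text):
    """Lowercase a text and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def coverage(syllabus_text, student_text):
    """Fraction of syllabus terms that also occur in the student text.

    The measure is asymmetric: only the syllabus vocabulary appears in
    the denominator, so coverage(a, b) generally differs from coverage(b, a).
    This lexical overlap is an illustrative stand-in, not the hybrid measure.
    """
    syllabus_terms = tokenize(syllabus_text)
    if not syllabus_terms:
        return 0.0
    return len(syllabus_terms & tokenize(student_text)) / len(syllabus_terms)


def best_paragraph(syllabus_section, student_paragraphs):
    """Return (index, score) of the paragraph that best covers the section.

    Returns (None, 0.0) when there are no paragraphs to align.
    """
    if not student_paragraphs:
        return None, 0.0
    scores = [coverage(syllabus_section, p) for p in student_paragraphs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

Under these assumptions, `coverage("cell membrane", "The cell membrane transports ions")` is 1.0, while swapping the arguments gives 0.4: the student text fully covers the syllabus terms, but not the reverse.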