Mistakes in UML Diagrams: Analysis of Student Projects in a Software Engineering Course

The Unified Modelling Language (UML) is widely accepted as a modelling notation for visualizing software systems during design and development. UML has thus become part of many software engineering curricula at universities worldwide, providing a recognized tool for practical training of students in understanding and visualizing software design. It is, however, common for students to have difficulty absorbing UML in its full complexity, and they often repeat the same mistakes that course tutors have observed in previous years. A catalogue of such mistakes could hence increase the effectiveness of both teaching and learning UML diagrams. In this paper, we introduce such a catalogue, consisting of 146 types of mistakes in eight types of diagrams. As the main contribution of this study, we use this catalogue to guide the analysis of student projects within a software engineering course. In total, over 2,700 diagrams submitted over 12 weeks of a semester by 123 students were analysed to identify the frequency of the catalogued mistakes, correlations of mistakes between different diagram types, the correlation of student project quality with exam results, student behaviour in introducing and fixing mistakes over time, and other insights. The analysis is described together with its setup and execution, and all datasets and a detailed guidebook to the catalogue of mistakes are made available for download.