Feature engineering for clustering student solutions

Open-ended homework problems such as coding assignments give students a broad range of freedom for the design of solutions. We aim to use the diversity in correct solutions to enhance student learning by automatically suggesting alternate solutions. Our approach is to perform a two-level hierarchical clustering of student solutions to first partition them based on the choice of algorithm and then partition solutions implementing the same algorithm based on low-level implementation details. Our initial investigations in domains of introductory programming and computer architecture demonstrate that we need two different classes of features to perform effective clustering at the two levels, namely abstract features and concrete features.