Co-Clustering by Bipartite Spectral Graph Partitioning for Out-of-Tutor Prediction

Learning a more distributed representation of the input feature space is a powerful way to boost the performance of a given predictor. This is often accomplished by partitioning the data into homogeneous groups via clustering, so that a separate model can be trained on each cluster. Intuitively, each such predictor represents the members of its cluster better than a single predictor trained on the entire data set. Previous work has used this premise to construct a simple yet strong bagging strategy. Such models, however, have one significant drawback: instances (such as students) are clustered while features (tutor usage features/items) are left untouched. One-way clustering optimizes an objective that measures homogeneity among data instances only, yet features often also contribute to prediction in homogeneous groups. This suggests a duality between clusters of instances and clusters of features. Co-clustering measures the degree of homogeneity in both data instances and features simultaneously, thereby achieving clustering and dimensionality reduction at the same time. Students and features can be modelled as the two vertex sets of a bipartite graph, and their simultaneous clustering can then be posed as a bipartite graph partitioning problem. In this paper we integrate an effective bagging strategy with co-clustering and present results for predicting the out-of-tutor performance of students. We find that such a strategy is useful and intuitive, improving upon the performance achieved by previous work.
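
The bipartite spectral co-clustering step can be illustrated with a minimal sketch in the style of Dhillon's algorithm (reference [24] below). It assumes a non-negative student-by-feature matrix A with no all-zero rows or columns; the variable names and the use of scikit-learn's KMeans are illustrative choices, not the authors' implementation.

```python
import numpy as np
from sklearn.cluster import KMeans


def bipartite_spectral_coclusters(A, k, random_state=0):
    """Partition the rows (students) and columns (features) of A into k co-clusters."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape

    # Degrees of the bipartite graph: row sums (students) and column sums (features).
    d1 = A.sum(axis=1)
    d2 = A.sum(axis=0)
    D1_inv_sqrt = 1.0 / np.sqrt(d1)
    D2_inv_sqrt = 1.0 / np.sqrt(d2)

    # Normalized matrix A_n = D1^{-1/2} A D2^{-1/2}.
    A_n = D1_inv_sqrt[:, None] * A * D2_inv_sqrt[None, :]

    # Singular vectors 2..(l+1) of A_n act as relaxed partition indicators,
    # with l = ceil(log2 k).
    l = int(np.ceil(np.log2(k)))
    U, _, Vt = np.linalg.svd(A_n, full_matrices=False)
    U_l = U[:, 1:l + 1]
    V_l = Vt.T[:, 1:l + 1]

    # Stack the scaled row and column embeddings and cluster them jointly,
    # so students and features are assigned to the same set of co-clusters.
    Z = np.vstack([D1_inv_sqrt[:, None] * U_l,
                   D2_inv_sqrt[:, None] * V_l])
    labels = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit_predict(Z)

    return labels[:m], labels[m:]
```

Given the returned student and feature labels, one predictor could be trained per student cluster (optionally restricted to that cluster's feature groups) and the per-cluster predictions combined, in the spirit of the bagging strategy described above.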
