Unsupervised data reduction

We propose a data reduction method based on fuzzy clustering and nonnegative matrix factorisation. In contrast to the variants of data set editing typically used for data reduction, our method is completely unsupervised, i.e., it does not need class labels to eliminate examples from a data set. It is therefore useful in exploratory data analysis, where class labels are unknown or unavailable and the aim is to gain insight into the structure of different groups of patterns. Also, unlike many clustering methods that represent each cluster by a single example (the cluster centroid), our method associates a set of the most representative examples with each cluster. Hence, it makes the cluster structure more transparent to a data analyst.
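To make the idea of keeping a set of representative examples per cluster concrete, the following is a minimal sketch and not the algorithm proposed here: it runs plain fuzzy c-means and keeps, for each cluster, the examples with the highest membership degree. The nonnegative matrix factorisation step is omitted entirely, and the function names, the fuzzifier m = 2, the deterministic initialisation, and the parameter per_cluster are all assumptions made for illustration.

```python
import math


def fuzzy_memberships(points, centers, m=2.0):
    """Fuzzy c-means membership of every point in every cluster."""
    power = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        # Clamp distances so a point sitting on a center does not divide by zero.
        d = [max(math.dist(p, c), 1e-12) for c in centers]
        memberships.append(
            [1.0 / sum((d[j] / d[k]) ** power for k in range(len(centers)))
             for j in range(len(centers))]
        )
    return memberships


def update_centers(points, memberships, m=2.0):
    """Recompute each center as the membership-weighted mean of the points."""
    dim = len(points[0])
    centers = []
    for j in range(len(memberships[0])):
        weights = [u[j] ** m for u in memberships]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[a] for w, p in zip(weights, points)) / total
            for a in range(dim)
        ))
    return centers


def representatives(points, n_clusters, per_cluster, n_iter=50, m=2.0):
    """Return, per cluster, the indices of the highest-membership examples.

    No class labels are used anywhere. Initial centers are evenly spaced
    data points (an assumption made only to keep the sketch deterministic).
    """
    step = len(points) // n_clusters
    centers = [points[i * step] for i in range(n_clusters)]
    for _ in range(n_iter):
        memberships = fuzzy_memberships(points, centers, m)
        centers = update_centers(points, memberships, m)
    memberships = fuzzy_memberships(points, centers, m)
    selected = []
    for j in range(n_clusters):
        ranked = sorted(range(len(points)),
                        key=lambda i: memberships[i][j], reverse=True)
        selected.append(ranked[:per_cluster])
    return selected


# Two well-separated groups; keep three of the five examples in each.
data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.5, 1.2), (0.1, 0.0),
        (10.0, 10.0), (10.1, 9.9), (9.9, 10.2), (8.5, 8.8), (10.0, 10.1)]
reduced = representatives(data, n_clusters=2, per_cluster=3)
```

In this sketch the reduced data set is the union of the selected indices, so the analyst inspects several typical examples per cluster rather than one centroid that may not correspond to any observed example.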