Clustering on subspaces and sparse representation of signals

In many practical problems the data X under consideration (given as an m × N matrix) is of the form X = AS, where the matrices A and S, with dimensions m × n and n × N respectively (often called the mixing matrix or dictionary and the source matrix), are unknown (m ≤ n < N). Under some conditions, we can recover A and S uniquely (up to scaling and permutation), such that S is r-sparse in the sense that each column of S has at most m - r nonzero elements. In this paper we consider the case r ≥ 2 and develop an algorithm for clustering over subspaces, which is essential for identification of the mixing matrix A. The idea of this clustering is the same as in the k-means clustering problem, but instead of balls, here we cluster over subspaces with co-dimension r. The problem is to find subspaces with co-dimension r such that the sum of the distances from the given data points to them is minimal. For identification of the source matrix, we apply a special source recovery algorithm. We illustrate our algorithms with an example. We note that our method is quite general, since the sparseness conditions can be obtained with some preprocessing methods and no independence conditions on the source signals are imposed (in contrast to independent component analysis).
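The clustering step described above can be illustrated with a minimal sketch. It is not the algorithm of the paper: it assumes a generic k-means-style alternation between assigning each data point to its nearest subspace and refitting each subspace by a singular value decomposition, which reduces the sum of squared distances rather than the sum of distances stated above. The function name cluster_on_subspaces and the parameters k, n_iter and seed are hypothetical and chosen only for illustration.

```python
import numpy as np


def cluster_on_subspaces(X, k, r, n_iter=50, seed=0):
    """Illustrative sketch: group the columns of X (m x N) around k linear
    subspaces of co-dimension r (all passing through the origin).

    Assumption: squared distances are used in the refit step, so this is a
    least-squares variant, not the exact objective of the abstract.
    """
    m, N = X.shape
    rng = np.random.default_rng(seed)
    # Each subspace is stored through an orthonormal basis (m x r) of its
    # orthogonal complement; start from random complements.
    normals = [np.linalg.qr(rng.standard_normal((m, r)))[0] for _ in range(k)]
    labels = np.full(N, -1)
    for _ in range(n_iter):
        # Assignment step: the distance from x to subspace j is the norm of
        # the projection of x onto the complement, ||normals[j].T @ x||.
        dists = np.stack([np.linalg.norm(Nj.T @ X, axis=0) for Nj in normals])
        new_labels = dists.argmin(axis=0)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
        # Update step: the left singular vectors belonging to the r smallest
        # singular values span the best-fitting complement for the cluster.
        for j in range(k):
            Xj = X[:, labels == j]
            if Xj.shape[1] > 0:
                U, _, _ = np.linalg.svd(Xj, full_matrices=True)
                normals[j] = U[:, m - r:]
    return labels, normals
```

Like k-means, such an alternation depends on the initialization and may stop at a local minimum, so in practice one would typically run it from several random starts and keep the result with the smallest total distance.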