Diffusion Fingerprints

We introduce a new method for classifying and clustering data exhibiting associative properties. By means of graph-theoretical tools, we show how to generate diffusion fingerprints for each subset of the data collection. We then propose a simple and computationally efficient technique for dimensionality reduction. Throughout this paper, we apply our method to the problem of classifying a corpus of text documents and compare it to other methods.