Sampling large data on graphs

We consider the problem of sampling data defined on the nodes of a weighted graph, where the edge weights capture the correlation structure of the data. Recent work has shown that spectral graph theory yields a cut-off frequency for a given set of samples (i.e., graph nodes): bandlimited graph signals below this frequency can be reconstructed from those samples. In this work, we show how this cut-off frequency can be computed exactly. Using this characterization, we provide efficient algorithms for finding the subset of nodes of a given size with the largest cut-off frequency and for finding the smallest subset of nodes with a given cut-off frequency. We also compare the performance of uniform random sampling with that of the centralized optimal sampling produced by the proposed algorithms.
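To make the notion of a cut-off frequency concrete, the following is a minimal brute-force sketch, not the algorithm proposed in this work. It assumes the combinatorial Laplacian L = D - W as the graph operator and uses the standard fact that signals spanned by a set of Laplacian eigenvectors are uniquely determined by their samples exactly when those eigenvectors, restricted to the sampled nodes, remain linearly independent. The function names `laplacian` and `max_recoverable_bandwidth` and the tolerance `tol` are hypothetical, introduced only for illustration. The full eigendecomposition makes this practical only for small graphs.

```python
import numpy as np


def laplacian(W):
    """Combinatorial Laplacian L = D - W of a symmetric weight matrix W (assumed operator)."""
    W = np.asarray(W, dtype=float)
    return np.diag(W.sum(axis=1)) - W


def max_recoverable_bandwidth(W, sample_nodes, tol=1e-9):
    """Largest Laplacian eigenvalue lam such that every signal spanned by the
    eigenvectors with eigenvalue <= lam is uniquely determined by its values
    on sample_nodes. Returns None if not even the lowest band is recoverable.
    Brute-force illustration only (hypothetical helper)."""
    eigvals, eigvecs = np.linalg.eigh(laplacian(W))  # eigenvalues in ascending order
    rows = eigvecs[list(sample_nodes), :]            # eigenvectors restricted to the samples
    n = len(eigvals)
    best = None
    start = 0
    while start < n:
        # Take a whole group of numerically equal eigenvalues at once.
        end = start + 1
        while end < n and eigvals[end] - eigvals[start] <= tol:
            end += 1
        # Uniqueness holds iff the restricted eigenvectors have full column rank.
        if np.linalg.matrix_rank(rows[:, :end]) < end:
            break
        best = eigvals[end - 1]
        start = end
    return best


# Usage: unweighted path graph on 4 nodes, sampling nodes 0 and 1.
W_path = np.array([[0, 1, 0, 0],
                   [1, 0, 1, 0],
                   [0, 1, 0, 1],
                   [0, 0, 1, 0]], dtype=float)
bandwidth = max_recoverable_bandwidth(W_path, [0, 1])
```

Under these assumptions, a subset-selection routine could call such a check repeatedly while adding nodes one at a time, although that would be far costlier than an approach based on an exact characterization of the cut-off frequency.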
