Detecting Localized Categorical Attributes on Graphs

Do users from Carnegie Mellon University form social communities on Facebook? Do signal processing researchers collaborate tightly with each other? Do Chinese restaurants in Manhattan cluster together? These seemingly different questions share a common structure: each asks whether an attribute is localized on a graph, that is, whether the nodes activated by the attribute form a subgraph that can be easily separated from the remaining nodes. In this paper, we focus on the task of detecting localized attributes on a graph. We are particularly interested in categorical attributes, such as attributes in online social networks, ratings in recommender systems, and viruses in cyber-physical systems, because they are widely used in data mining applications. To solve the task, we formulate a statistical hypothesis testing problem that decides whether a given attribute is localized. We propose two statistics, a graph wavelet statistic and a graph scan statistic, both of which are provably effective in detecting localized attributes. We validate the robustness of the proposed statistics on simulated data and on two real-world applications: detecting high air pollution and ranking keywords in a coauthorship network collected from IEEE Xplore. Experimental results show that both proposed statistics are effective and efficient.
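To make the testing idea concrete, the following is a minimal sketch of a localization test on a toy graph. It is not the graph wavelet statistic or the graph scan statistic proposed in the paper; as a stand-in, it assumes a simple cut-size statistic (the number of edges leaving the activated set) and a permutation null in which the same number of nodes is activated uniformly at random. The names `cut_size` and `localization_p_value` and the toy graph are hypothetical and chosen only for illustration.

```python
import random

def cut_size(edges, active):
    """Count edges with exactly one endpoint in the activated node set."""
    return sum((u in active) != (v in active) for u, v in edges)

def localization_p_value(edges, nodes, active, trials=2000, seed=0):
    """Permutation test (illustrative assumption, not the paper's method).

    Null hypothesis: the attribute activates len(active) nodes chosen
    uniformly at random. A small p-value means the observed cut is
    unusually small, i.e. the attribute looks localized.
    """
    rng = random.Random(seed)
    observed = cut_size(edges, active)
    k = len(active)
    hits = 0
    for _ in range(trials):
        sample = set(rng.sample(nodes, k))
        if cut_size(edges, sample) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (trials + 1)

# Toy graph: two 5-node cliques joined by a single bridge edge.
clique_a = [(i, j) for i in range(5) for j in range(i + 1, 5)]
clique_b = [(i, j) for i in range(5, 10) for j in range(i + 1, 10)]
edges = clique_a + clique_b + [(4, 5)]
nodes = list(range(10))

localized = {0, 1, 2, 3, 4}   # attribute fills one clique
scattered = {0, 2, 5, 7, 9}   # attribute spread over both cliques

p_localized = localization_p_value(edges, nodes, localized)
p_scattered = localization_p_value(edges, nodes, scattered)
print(f"localized: cut={cut_size(edges, localized)}, p={p_localized:.4f}")
print(f"scattered: cut={cut_size(edges, scattered)}, p={p_scattered:.4f}")
```

In this toy setting the clique-filling attribute is separated from the rest of the graph by a single edge, so random activations rarely match it, while the scattered attribute cuts many edges and is not flagged. The cut-based statistic is only one simple way to quantify how easily the activated nodes can be separated from the others.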
