Detection Theory for Graphs

For many Lincoln Laboratory mission areas, relationships between entities of interest are important. Over the years, much work has gone into the detection of entities, such as radio emissions, vehicles in images, or people named in documents. Examining the relationships between entities (Which emitters are co-located? Which vehicles stopped at the same place? Which people are mentioned in the same documents?) can significantly improve situational awareness and may allow analysts to find subtle, coordinated activity that would go undetected if the relational components of the data were not considered. In a cyber security application, for example, the volume of network traffic may not be high on any particular node, but if there are notably higher rates of communication among a small subset of nodes where the traffic is usually more diffuse, some suspicious activity may be occurring. While the entities alone may not exhibit any detectably anomalous behavior, their relationships and interactions may indicate the presence of interesting activity.

Graphs provide a natural representation for relational data. A graph G = (V, E) is a pair of sets: a set of vertices, V, that denote the entities and a set of edges, E, that represent relationships or connections between the entities. Graphs have been used, implicitly or explicitly, for hundreds of years to represent sets of objects that are somehow connected, such as points on possible travel routes, nodes in electrical circuits, or interacting particles. More recently, graphs have gained popularity in the modeling of relational data, such as social and computer networks. In the last two decades, the explosion in new data-collec…

Graphs are fast emerging as a common data structure used in many scientific and engineering fields.
While a wide variety of techniques exist to analyze graph datasets, practitioners currently lack a signal processing theory akin to that of detection and estimation in the classical setting of vector spaces with Gaussian noise. Using practical detection examples involving large, random "background" graphs and noisy real-world datasets, the authors present a novel graph analytics framework that allows for uncued analysis of very large datasets. This framework combines traditional computer science techniques with signal processing in the context of graph data, creating a new research area at the intersection of the two fields.

Each data observation considered in this article is in the form of a graph G = (V, E). Each vertex in V …
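
The definition G = (V, E) and the cyber security example can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not the authors' framework: the function names, the graph size, and the edge probabilities are all hypothetical, and the random background is a simple model in which every vertex pair is connected independently with the same probability. The sketch plants unusually dense connections among a small vertex subset and compares per-vertex degree with the edge density inside that subset.

```python
import random
from itertools import combinations


def random_background(n, p, rng):
    """Return (V, E) where each unordered vertex pair is an edge with probability p."""
    vertices = set(range(n))
    edges = {
        frozenset(pair)
        for pair in combinations(range(n), 2)
        if rng.random() < p
    }
    return vertices, edges


def embed_dense_subgraph(edges, subset, q, rng):
    """Return a new edge set with extra edges inside `subset`, each added with probability q."""
    extra = {
        frozenset(pair)
        for pair in combinations(sorted(subset), 2)
        if rng.random() < q
    }
    return edges | extra


def degree(vertices, edges):
    """Count the edges incident to each vertex."""
    deg = {v: 0 for v in vertices}
    for edge in edges:
        for v in edge:
            deg[v] += 1
    return deg


def density(subset, edges):
    """Fraction of vertex pairs inside `subset` that are connected by an edge."""
    pairs = [frozenset(pair) for pair in combinations(sorted(subset), 2)]
    return sum(pair in edges for pair in pairs) / len(pairs)


if __name__ == "__main__":
    rng = random.Random(0)                      # fixed seed, for repeatability only
    V, E_background = random_background(n=200, p=0.05, rng=rng)
    suspects = set(range(8))                    # hypothetical coordinated subset
    E = embed_dense_subgraph(E_background, suspects, q=0.9, rng=rng)

    deg = degree(V, E)
    print("max degree overall:     ", max(deg.values()))
    print("max degree in subset:   ", max(deg[v] for v in suspects))
    print("subset density (before):", density(suspects, E_background))
    print("subset density (after): ", density(suspects, E))
```

With parameters like these, the degrees of the planted vertices tend to stay within the range seen elsewhere in the background, while the density inside the subset rises sharply. That is the sense in which the relationships, rather than the individual entities, carry the signal.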
