Signal Processing on Graphs: Modeling (Causal) Relations in Big Data

Many big data applications collect a large number of time series: for example, the financial data of companies quoted on a stock exchange, the health care data of all patients who visit a hospital emergency room, or the temperature sequences continuously measured by weather stations across the US. A first task in analyzing these data is to derive a low-dimensional representation, a graph or discrete manifold, that describes well the interrelations among the time series and their intrarelations across time. This paper presents a computationally tractable algorithm for estimating this graph structure from the available data. The graph is directed and weighted, possibly representing causal relations and not just correlations, as in most existing approaches in the literature. The algorithm is demonstrated on random-graph and real-network time series datasets, and its performance is compared with that of related methods. The adjacency matrices estimated with the new method are close to the true graph in the simulated data and consistent with prior physical knowledge in the real dataset tested.
