Functional Graphical Models

ABSTRACT Graphical models have attracted increasing attention in recent years, especially in settings involving high-dimensional data. In particular, Gaussian graphical models are used to model the conditional dependence structure among multiple Gaussian random variables. Owing to its computational efficiency, the graphical lasso (glasso) has become one of the most popular approaches for fitting high-dimensional graphical models. In this paper, we extend the graphical model concept to model the conditional dependence structure among p random functions. In this setting, not only is p large, but each function is itself a high-dimensional object, which adds a further level of statistical and computational complexity. We develop an extension of the glasso criterion (fglasso), which estimates the functional graphical model by imposing a block sparsity constraint on the precision matrix via a group lasso penalty. The fglasso criterion can be optimized using an efficient block coordinate descent algorithm. We establish concentration inequalities for the estimates, which guarantee the desirable graph support recovery property: with probability tending to one, the fglasso correctly identifies the true conditional dependence structure. Finally, we show that the fglasso significantly outperforms possible competing methods through both simulations and an analysis of a real-world electroencephalography dataset comparing alcoholic and nonalcoholic patients.
