High-Dimensional Gaussian Graphical Model Selection: Tractable Graph Families

We consider the problem of high-dimensional Gaussian graphical model selection. We identify a set of graphs for which an efficient estimation algorithm exists, based on thresholding of empirical conditional covariances. Under a set of transparent conditions, we establish structural consistency (or sparsistency) for the proposed algorithm when the number of samples satisfies n = Ω(J_min^{-2} log p), where p is the number of variables and J_min is the minimum (absolute) edge potential of the graphical model. The sufficient conditions for sparsistency are based on the notion of walk-summability of the model and the presence of sparse local vertex separators in the underlying graph. We also derive novel non-asymptotic necessary conditions on the number of samples required for sparsistency.
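To make the thresholding idea concrete, here is a minimal Python sketch of conditional covariance thresholding, not the authors' implementation: the names `threshold_graph_estimate` and `conditional_covariance` are illustrative, and `xi` and `eta` stand in for the threshold and the bound on the local separator size. For each pair of variables it takes the minimum empirical conditional covariance over all small conditioning sets and keeps the edge only if that minimum stays above the threshold.

```python
import numpy as np
from itertools import combinations


def conditional_covariance(S_hat, i, j, cond):
    """Empirical conditional covariance Sigma(i, j | cond) of a Gaussian:
    Sigma(i, j) - Sigma(i, cond) Sigma(cond, cond)^{-1} Sigma(cond, j)."""
    if not cond:
        return S_hat[i, j]
    cond = list(cond)
    A = S_hat[np.ix_(cond, cond)]
    return S_hat[i, j] - S_hat[i, cond] @ np.linalg.solve(A, S_hat[cond, j])


def threshold_graph_estimate(X, xi, eta):
    """Estimate the graph by thresholding minimal empirical conditional
    covariances over all conditioning sets of size at most eta.

    X   : (n, p) matrix of samples
    xi  : threshold on the absolute conditional covariance
    eta : maximum size of the conditioning (separator) set
    Returns a (p, p) boolean adjacency matrix."""
    n, p = X.shape
    S_hat = np.cov(X, rowvar=False)  # empirical covariance matrix
    adj = np.zeros((p, p), dtype=bool)
    for i in range(p):
        for j in range(i + 1, p):
            rest = [k for k in range(p) if k not in (i, j)]
            # Minimum conditional covariance over all small conditioning
            # sets; an edge (i, j) is kept only if no such set explains
            # the dependence away.
            cc_min = min(
                abs(conditional_covariance(S_hat, i, j, S))
                for r in range(eta + 1)
                for S in combinations(rest, r)
            )
            adj[i, j] = adj[j, i] = cc_min > xi
    return adj
```

The brute-force search over conditioning sets costs O(p^eta) per pair, which is tractable precisely because the sufficient conditions above only require sparse local vertex separators, so a small `eta` suffices.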
