Paper Information - Learning Tree Structures from Noisy Data

Learning Tree Structures from Noisy Data

We provide high-probability sample complexity guarantees for exact structure recovery of tree-structured graphical models, when only noisy observations of the respective vertex emissions are available. We assume that the hidden variables follow either an Ising model or a Gaussian graphical model, and that the observables are noise-corrupted versions of the hidden variables: we consider multiplicative $\pm 1$ binary noise for Ising models, and additive Gaussian noise for Gaussian models. Such hidden models arise naturally in a variety of fields, such as physics, biology, computer science, and finance. We study the impact of measurement noise on the task of learning the underlying tree structure via the well-known \textit{Chow-Liu algorithm} and provide formal sample complexity guarantees for exact recovery. In particular, for a tree with $p$ vertices and probability of failure $\delta>0$, we show that the number of necessary samples for exact structure recovery is of the order of $\mathcal{O}(\log(p/\delta))$ for Ising models (which remains the \textit{same as in the noiseless case}), and $\mathcal{O}(\mathrm{polylog}(p/\delta))$ for Gaussian models.
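To make the setting concrete, the following is a minimal sketch of Chow-Liu structure recovery from noisy Ising observations. It is an illustration under stated assumptions, not the authors' implementation: the model has zero external field, every edge has the same flip probability, every vertex is corrupted by the same multiplicative $\pm 1$ noise level, and the maximum spanning tree is built on absolute empirical correlations (a common simplification for symmetric binary variables, where mutual information increases with the absolute correlation). The function names, the example tree, and all parameter values are hypothetical.

```python
import random
from itertools import combinations


def sample_noisy_tree_ising(edges, p, n, flip_prob, noise_prob, rng):
    """Draw n noisy samples from a zero-field tree Ising model rooted at vertex 0.

    `edges` lists (parent, child) pairs ordered so that every parent is
    assigned before its children. Each hidden spin is then multiplied by
    independent +/-1 noise that equals -1 with probability `noise_prob`.
    """
    samples = []
    for _ in range(n):
        x = [0] * p
        x[0] = rng.choice((-1, 1))
        for parent, child in edges:
            # The child copies its parent, flipped with probability flip_prob.
            x[child] = -x[parent] if rng.random() < flip_prob else x[parent]
        # Multiplicative binary noise on every vertex.
        y = [-v if rng.random() < noise_prob else v for v in x]
        samples.append(y)
    return samples


def chow_liu_tree(samples, p):
    """Maximum spanning tree on |empirical correlation| (Kruskal + union-find)."""
    n = len(samples)
    weights = []
    for i, j in combinations(range(p), 2):
        corr = sum(s[i] * s[j] for s in samples) / n
        weights.append((abs(corr), i, j))
    weights.sort(reverse=True)

    root = list(range(p))

    def find(u):
        while root[u] != u:
            root[u] = root[root[u]]  # path halving
            u = root[u]
        return u

    tree = set()
    for _, i, j in weights:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding (i, j) does not create a cycle
            root[ri] = rj
            tree.add((i, j))
    return tree


# Hypothetical 8-vertex tree; each pair is (parent, child) with parent < child.
TRUE_EDGES = [(0, 1), (0, 2), (1, 3), (1, 4), (2, 5), (5, 6), (5, 7)]

rng = random.Random(0)
noisy_samples = sample_noisy_tree_ising(
    TRUE_EDGES, p=8, n=10000, flip_prob=0.2, noise_prob=0.1, rng=rng
)
estimated_edges = chow_liu_tree(noisy_samples, p=8)
print("exact recovery:", estimated_edges == set(TRUE_EDGES))
```

With these illustrative values the noise shrinks every pairwise correlation by the same factor $(1-2q)^2$, with $q$ the noise probability, so the ordering of edge and non-edge correlations is preserved and only the gap between them narrows. This is one informal way to see why noise affects the number of samples needed rather than the correctness of the procedure in this equal-noise case.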

Anand D. Sarwate | Dionysios S. Kalogerias | Konstantinos E. Nikolakakis
