Accounting for missing actors in interaction network inference from abundance data

Network inference aims at unraveling the dependency structure relating jointly observed variables. Graphical models provide a general framework to distinguish between marginal and conditional dependency. Unobserved variables (missing actors) may induce apparent conditional dependencies. In the context of count data, we introduce a mixture of Poisson log-normal distributions with tree-shaped graphical models to recover the dependency structure, including missing actors. We design a variational EM algorithm and assess its performance on synthetic data. We demonstrate the ability of our approach to recover environmental drivers on two ecological datasets. The corresponding R package is available online.
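The claim that a missing actor can induce apparent conditional dependencies can be illustrated with a minimal sketch in the Gaussian case, which is simpler than the Poisson log-normal setting of the paper and is not the authors' method. The precision matrix `K`, the star-shaped graph, and the helper names `marginal_precision` and `edges` are all assumptions made for this example. Three observed nodes are each linked only to one hidden node; integrating the hidden node out (a Schur complement) yields a marginal precision matrix in which every pair of observed nodes appears connected.

```python
# Illustrative sketch only: Gaussian graphical model with one hidden node.
# K is a hypothetical precision matrix; nodes 0, 1, 2 are observed, node 3 is hidden.
# In a Gaussian graphical model, a zero entry K[i][j] means "no edge between i and j".

K = [
    [1.0, 0.0, 0.0, 0.4],
    [0.0, 1.0, 0.0, 0.4],
    [0.0, 0.0, 1.0, 0.4],
    [0.4, 0.4, 0.4, 1.0],
]
OBSERVED = [0, 1, 2]
HIDDEN = 3


def marginal_precision(K, observed, hidden):
    """Precision matrix of the observed nodes after integrating out ONE hidden node.

    This is the Schur complement K_OO - K_OH * K_HH^{-1} * K_HO, where K_HH is a
    scalar because there is a single hidden node.
    """
    k_hh = K[hidden][hidden]
    return [
        [K[i][j] - K[i][hidden] * K[hidden][j] / k_hh for j in observed]
        for i in observed
    ]


def edges(P, tol=1e-12):
    """Set of index pairs (i, j), i < j, whose off-diagonal entry in P is non-zero."""
    n = len(P)
    return {(i, j) for i in range(n) for j in range(i + 1, n) if abs(P[i][j]) > tol}


# Edges among observed nodes in the complete model (hidden node kept in the graph).
conditional_block = [[K[i][j] for j in OBSERVED] for i in OBSERVED]
true_edges = edges(conditional_block)

# Edges among observed nodes once the hidden node is ignored.
marginal = marginal_precision(K, OBSERVED, HIDDEN)
apparent_edges = edges(marginal)

print("edges among observed nodes, hidden node accounted for:", sorted(true_edges))
print("edges among observed nodes, hidden node ignored:", sorted(apparent_edges))
```

In this toy setting, no observed pair is directly linked once the hidden node is accounted for, whereas ignoring it makes all three pairs look connected. This is the kind of spurious structure that modelling missing actors is meant to remove.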
