On Poisson Graphical Models

Undirected graphical models, such as Gaussian graphical models, Ising, and multinomial/categorical graphical models, are widely used in a variety of applications for modeling distributions over a large number of variables. These standard instances, however, are ill-suited to modeling count data, which are increasingly ubiquitous in big-data settings such as genomic sequencing data, user-ratings data, spatial incidence data, climate studies, and site visits. Existing classes of Poisson graphical models, which arise as the joint distributions that correspond to Poisson distributed node-conditional distributions, have a major drawback: they can only model negative conditional dependencies for reasons of normalizability given its infinite domain. In this paper, our objective is to modify the Poisson graphical model distribution so that it can capture a rich dependence structure between count-valued variables. We begin by discussing two strategies for truncating the Poisson distribution and show that only one of these leads to a valid joint distribution. While this model can accommodate a wider range of conditional dependencies, some limitations still remain. To address this, we investigate two additional novel variants of the Poisson distribution and their corresponding joint graphical model distributions. Our three novel approaches provide classes of Poisson-like graphical models that can capture both positive and negative conditional dependencies between count-valued variables. One can learn the graph structure of our models via penalized neighborhood selection, and we demonstrate the performance of our methods by learning simulated networks as well as a network from microRNA-sequencing data.

[1]  Larry A. Wasserman,et al.  The Nonparanormal: Semiparametric Estimation of High Dimensional Undirected Graphs , 2009, J. Mach. Learn. Res..

[2]  Genevera I. Allen,et al.  A Log-Linear Graphical Model for inferring genetic networks from high-throughput sequencing data , 2012, 2012 IEEE International Conference on Bioinformatics and Biomedicine.

[3]  Robert A. Weinberg,et al.  Therapeutic silencing of miR-10b inhibits metastasis in a mouse mammary tumor model , 2010, Nature Biotechnology.

[4]  R. Tibshirani,et al.  Sparse inverse covariance estimation with the lasso , 2007, 0708.3517.

[5]  M. Stephens,et al.  RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. , 2008, Genome research.

[6]  N. Meinshausen,et al.  High-dimensional graphs and variable selection with the Lasso , 2006, math/0608017.

[7]  Larry A. Wasserman,et al.  High Dimensional Semiparametric Gaussian Copula Graphical Models. , 2012, ICML 2012.

[8]  Steven J. M. Jones,et al.  Comprehensive molecular portraits of human breast tumours , 2013 .

[9]  D. Karlis An EM algorithm for multivariate Poisson distribution and related models , 2003 .

[10]  Steven J. M. Jones,et al.  Comprehensive molecular portraits of human breast tumors , 2012, Nature.

[11]  Alexandre d'Aspremont,et al.  Model Selection Through Sparse Max Likelihood Estimation Model Selection Through Sparse Maximum Likelihood Estimation for Multivariate Gaussian or Binary Data , 2022 .

[12]  C. Croce,et al.  Epigenetically deregulated microRNA-375 is involved in a positive feedback loop with estrogen receptor alpha in breast cancer cells. , 2010, Cancer research.

[13]  J. Lafferty,et al.  High-dimensional Ising model selection using ℓ1-regularized logistic regression , 2010, 1010.0311.

[14]  M. Yuan,et al.  Model selection and estimation in the Gaussian graphical model , 2007 .

[15]  Pradeep Ravikumar,et al.  Graphical Models via Generalized Linear Models , 2012, NIPS.

[16]  Noel A Cressie,et al.  Modeling Poisson variables with positive spatial dependence , 1997 .

[17]  J. Besag Spatial Interaction and the Statistical Analysis of Lattice Systems , 1974 .

[18]  Galit Shmueli,et al.  An Elegant Method for Generating Multivariate Poisson Random Variables , 2007 .

[19]  Larry A. Wasserman,et al.  The Nonparanormal SKEPTIC , 2012, ICML 2012.

[20]  P. Holgate Estimation for the bivariate Poisson distribution , 1964 .

[21]  Pradeep Ravikumar,et al.  Graphical models via univariate exponential family distributions , 2013, J. Mach. Learn. Res..

[22]  Ali Jalali,et al.  On Learning Discrete Graphical Models using Group-Sparse Regularization , 2011, AISTATS.

[23]  A. Dobra,et al.  Copula Gaussian graphical models and their application to modeling functional disability data , 2011, 1108.1680.

[24]  Larry A. Wasserman,et al.  Stability Approach to Regularization Selection (StARS) for High Dimensional Graphical Models , 2010, NIPS.

[25]  Noel A Cressie,et al.  Statistics for Spatial Data. , 1992 .

[26]  Daniel A. Griffith,et al.  A spatial filtering specification for the auto-Poisson model , 2002 .