Ordinal Graphical Models: A Tale of Two Approaches

Undirected graphical models, or Markov random fields (MRFs), are widely used for modeling multivariate probability distributions. Much of the work on MRFs has focused on continuous variables and on nominal variables (that is, unordered categorical variables). However, data from many real-world applications involve ordered categorical variables, also known as ordinal variables, e.g., movie ratings on Netflix, which can be ordered from 1 to 5 stars. With respect to univariate ordinal distributions, as we detail in the paper, there are two main categories of distributions. While there have been efforts to extend these to multivariate ordinal distributions, the resulting distributions are typically very complex, with either a large number of parameters or non-convex likelihoods. While there has been some work on tractable approximations, these do not come with strong statistical guarantees and, moreover, are relatively computationally expensive. In this paper, we theoretically investigate two classes of graphical models for ordinal data, corresponding to the two main categories of univariate ordinal distributions. In contrast to previous work, our theoretical developments allow us to provide two corresponding classes of estimators that are not only computationally efficient but also have strong statistical guarantees.
