Mutual conditional independence and its applications to model selection in Markov networks

The fundamental concepts underlying Markov networks are conditional independence and the set of rules, called Markov properties, that translate conditional independence constraints into graphs. We introduce the concept of mutual conditional independence in an independent set of a Markov network, and we prove its equivalence to the Markov properties under certain regularity conditions. This extends the correspondence between separation in a graph and conditional independence in probability to a correspondence between mutual separation in a graph and mutual conditional independence in probability. Model selection in graphical models remains a challenging task because of the large search space. We show that the mutual conditional independence property can be exploited to reduce the search space. We present a new forward model selection algorithm for graphical log-linear models that uses mutual conditional independence, and we illustrate the algorithm with a real data set. We show that, for sparse models, the proposed forward selection method reduces the size of the search space from $\mathcal{O}(n^{3})$ for classical forward selection to $\mathcal{O}(n^{2})$. We also envision that this property can be leveraged for model selection and inference in other types of graphical models.
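The graph-side notions mentioned in the abstract can be illustrated with a minimal sketch. It assumes an undirected graph stored as a dictionary of neighbour sets; the function names `is_independent_set` and `mutually_separated` are hypothetical and are not taken from the paper. The sketch only checks the graph conditions (an independent set, and pairwise separation of its members by all remaining vertices). It does not test conditional independence in data, and it is not the paper's selection algorithm.

```python
from itertools import combinations


def is_independent_set(adj, nodes):
    """Return True if no two vertices in `nodes` are joined by an edge."""
    return all(v not in adj[u] for u, v in combinations(nodes, 2))


def mutually_separated(adj, nodes):
    """Return True if every pair in `nodes` is separated by the other vertices.

    Separation is checked by searching from each vertex in `nodes` while
    refusing to pass through any vertex outside `nodes` (the separating set).
    """
    nodes = set(nodes)
    rest = set(adj) - nodes  # candidate separating set: all remaining vertices
    for source in nodes:
        seen = {source}
        stack = [source]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w in rest or w in seen:
                    continue  # blocked by the separator, or already visited
                seen.add(w)
                stack.append(w)
        if seen & (nodes - {source}):
            return False  # another member was reached without crossing `rest`
    return True


# Illustrative 4-cycle a - b - c - d - a (made-up example graph).
cycle = {
    "a": {"b", "d"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"a", "c"},
}

independent_ac = is_independent_set(cycle, ["a", "c"])
separated_ac = mutually_separated(cycle, ["a", "c"])
independent_ab = is_independent_set(cycle, ["a", "b"])
separated_ab = mutually_separated(cycle, ["a", "b"])
```

In this example the set {a, c} is independent and its members are separated by {b, d}, whereas {a, b} fails both checks because a and b are adjacent. When the separating set is the whole remainder of the graph, the two checks always agree, which is the graph-side half of the correspondence the abstract describes.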
