Towards Conditional Independence Test for Relational Data

Conditional independence (CI) tests play a central role in statistical inference, machine learning, and causal discovery. Most existing CI tests assume that the samples are independent and identically distributed (i.i.d.), an assumption that often fails for relational data. We define Relational Conditional Independence (RCI), a generalization of CI to the relational setting. We show that, under a set of structural assumptions, testing for RCI on non-i.i.d. data reduces to testing for CI on several data sets, each of which consists of i.i.d. samples. We then develop the Kernel Relational CI Test (KRCIT), a nonparametric test that offers a practical approach to testing for RCI by relaxing the structural assumptions used in our analysis. Experiments with synthetic relational data show the benefits of KRCIT relative to traditional CI tests that do not account for the non-i.i.d. nature of relational data.
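To make the kernel-based, nonparametric flavor of such tests concrete, the following is a minimal sketch of a standard kernel independence test for i.i.d. samples: the Hilbert-Schmidt Independence Criterion (HSIC) with a permutation null. It is not KRCIT and it tests unconditional rather than conditional independence; it only illustrates the kind of i.i.d. building block that kernel CI tests rely on. The function names, the Gaussian kernel, and the median-heuristic bandwidth are all assumptions chosen for illustration.

```python
import numpy as np


def rbf_gram(x):
    """Gaussian (RBF) Gram matrix; bandwidth set by the median heuristic (an assumption)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    sq = np.sum(x ** 2, axis=1)
    # Pairwise squared Euclidean distances, clipped at zero for numerical safety.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * x @ x.T, 0.0)
    med = np.median(d2[np.triu_indices_from(d2, k=1)])
    sigma2 = med if med > 0 else 1.0
    return np.exp(-d2 / (2.0 * sigma2))


def hsic_permutation_test(x, y, n_permutations=200, seed=0):
    """Return (statistic, p_value) for H0: x and y are independent (i.i.d. samples)."""
    rng = np.random.default_rng(seed)
    K, L = rbf_gram(x), rbf_gram(y)
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    Kc = H @ K @ H                           # centered Gram matrix of x
    # Biased HSIC estimate, trace(K H L H) / n^2 (L is symmetric).
    observed = np.sum(Kc * L) / n ** 2
    null = np.empty(n_permutations)
    for b in range(n_permutations):
        p = rng.permutation(n)               # shuffling y breaks any x-y dependence
        null[b] = np.sum(Kc * L[np.ix_(p, p)]) / n ** 2
    p_value = (1 + np.sum(null >= observed)) / (1 + n_permutations)
    return observed, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(size=150)
    y = x ** 2 + 0.1 * rng.normal(size=150)  # nonlinear dependence on x
    stat, p = hsic_permutation_test(x, y)
    print(f"HSIC = {stat:.4f}, permutation p-value = {p:.4f}")
```

The permutation null is valid only because the samples are exchangeable under i.i.d. sampling. For relational data that exchangeability generally fails, which is why a test of this kind cannot be applied directly to such data.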
