Identification In Missing Data Models Represented By Directed Acyclic Graphs

Missing data is a pervasive problem in data analyses, resulting in datasets that contain censored realizations of a target distribution. Many approaches to inference on the target distribution using censored observed data rely on missing data models represented as a factorization with respect to a directed acyclic graph. In this paper, we consider the identifiability of the target distribution within this class of models and show that the most general identification strategies proposed so far retain a significant gap, in that they fail to identify a wide class of identifiable distributions. To address this gap, we propose a new algorithm that significantly generalizes the types of manipulations used in the ID algorithm [14, 16], developed in the context of causal inference, in order to obtain identification.

[1]  Judea Pearl, et al.  Graphical Models for Recovering Probabilistic and Causal Queries from Missing Data, 2014, NIPS.

[2]  Marco Valtorta, et al.  Pearl's Calculus of Intervention Is Complete, 2006, UAI.

[3]  Jin Tian, et al.  Graphical Models for Inference with Missing Data, 2013, NIPS.

[4]  BaoLuo Sun, et al.  Discrete Choice Models for Nonmonotone Nonignorable Missing Data: Identification and Inference, 2017, Statistica Sinica Preprint No. SS-2016-0325.

[5]  J. Kalbfleisch, et al.  Block-Conditional Missing at Random Models for Missing Data, 2010, arXiv:1104.2400.

[6]  Ilya Shpitser, et al.  Consistent Estimation of Functions of Data Missing Non-Monotonically and Not at Random, 2016, NIPS.

[7]  A. Tsiatis.  Semiparametric Theory and Missing Data, 2006.

[8]  Judea Pearl, et al.  Probabilistic Reasoning in Intelligent Systems, 1988.

[9]  J. M. Robins, et al.  Non-response models for the analysis of non-monotone non-ignorable missing data, 1997, Statistics in Medicine.

[10]  J. Robins.  A new approach to causal inference in mortality studies with a sustained exposure period: application to control of the healthy worker survivor effect, 1986.

[11]  J. Pearl.  Causality: Models, Reasoning and Inference, 2000.

[12]  James M. Robins, et al.  Nested Markov Properties for Acyclic Directed Mixed Graphs, 2012, UAI.

[13]  Jin Tian, et al.  A general identification condition for causal effects, 2002, AAAI/IAAI.

[14]  Jerome P. Reiter, et al.  Itemwise conditionally independent nonresponse modeling for incomplete multivariate data, 2016, arXiv:1609.00656.

[15]  Judea Pearl, et al.  Identification of Joint Interventional Distributions in Recursive Semi-Markovian Causal Models, 2006, AAAI.

[16]  R. D. Gill, et al.  Non-response models for the analysis of non-monotone ignorable missing data, 1997, Statistics in Medicine.

[17]  Judea Pearl, et al.  Missing Data as a Causal and Probabilistic Problem, 2015, UAI.