Recovering Probability Distributions from Missing Data

A probabilistic query may not be estimable from observed data corrupted by missing values if the data are not missing at random (MAR). It is therefore of both theoretical interest and practical importance to determine, in principle, whether a probabilistic query is estimable from missing data when the data are not MAR. We present algorithms that systematically determine whether the joint probability distribution or a target marginal distribution is estimable from observed data with missing values, assuming that the data-generation model is represented as a Bayesian network, known as an m-graph, that not only encodes the dependencies among the variables but also explicitly portrays the mechanisms responsible for the missingness process. The results significantly advance the existing work.
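
To make the notion of estimability concrete, the following is a minimal sketch of the simplest recoverable case, not the algorithms presented in the paper. It assumes a hypothetical three-node m-graph X -> Y, X -> R_Y, in which X is fully observed, Y is sometimes missing, and the missingness of Y depends only on X. Under that assumption Y is independent of its missingness indicator given X, so P(X, Y) = P(X) P(Y | X, Y observed). All probability tables and function names below are illustrative assumptions. The sketch works with exact distributions rather than samples, and it contrasts the recovered joint with the naive complete-case estimate, which is biased here.

```python
from itertools import product

# Hypothetical ground truth for a toy m-graph X -> Y, X -> R_Y (all binary).
p_x = {0: 0.6, 1: 0.4}                                    # P(X)
p_y_given_x = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.3, 1: 0.7}}  # P(Y | X)
p_obs_given_x = {0: 0.9, 1: 0.5}                          # P(Y is observed | X)

# What the analyst can see: the distribution over observed records.
# observed[(x, y)] = P(X=x, Y=y, Y observed); missing[x] = P(X=x, Y missing).
observed = {
    (x, y): p_x[x] * p_y_given_x[x][y] * p_obs_given_x[x]
    for x, y in product((0, 1), repeat=2)
}
missing = {x: p_x[x] * (1.0 - p_obs_given_x[x]) for x in (0, 1)}


def recover_joint(observed, missing):
    """Recover P(X, Y) as P(X) * P(Y | X, Y observed).

    Valid only under the assumed graph, where missingness of Y
    depends on the fully observed X alone.
    """
    joint = {}
    for (x, y), p_xy_obs in observed.items():
        # P(X=x, Y observed): sum over y of the observed-record mass.
        p_x_obs = sum(q for (xx, _), q in observed.items() if xx == x)
        # P(X=x): X is always observed, so use every record.
        p_x_all = p_x_obs + missing[x]
        joint[(x, y)] = p_x_all * p_xy_obs / p_x_obs
    return joint


def complete_case(observed):
    """Naive estimate: renormalize over records with Y observed."""
    total = sum(observed.values())
    return {k: v / total for k, v in observed.items()}


truth = {(x, y): p_x[x] * p_y_given_x[x][y] for x, y in product((0, 1), repeat=2)}
recovered = recover_joint(observed, missing)
naive = complete_case(observed)

for key in sorted(truth):
    print(key, round(truth[key], 4), round(recovered[key], 4), round(naive[key], 4))
```

The recovered joint matches the ground truth because the factorization uses only quantities available in the observed data. If the missingness of Y depended on Y itself, this factorization would no longer be justified, which is the kind of case where a systematic estimability test is needed.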
