Interventions and Counterfactuals in Tractable Probabilistic Models: Limitations of Contemporary Transformations

In recent years, there has been increasing interest in studying causality-related properties in machine learning models generally, and in generative models in particular. While that line of work is well motivated, it inherits the fundamental computational hardness of probabilistic inference, making exact reasoning intractable. Tractable probabilistic models, usually learned from data, have also recently emerged; they guarantee that conditional marginals can be computed in time linear in the size of the model. Although initially limited to low tree-width models, recent tractable models such as sum-product networks (SPNs) and probabilistic sentential decision diagrams (PSDDs) exploit efficient function representations and also capture high tree-width models. In this paper, we ask the following technical question: can we use the distributions represented or learned by these models to perform causal queries, such as reasoning about interventions and counterfactuals? By appealing to some existing ideas on transforming such models to Bayesian networks, we answer mostly in the negative. We show that when an SPN is transformed into a causal graph, interventional reasoning reduces to computing marginal distributions; in other words, only trivial causal reasoning is possible. For PSDDs, the situation is only slightly better. We first provide an algorithm for constructing a causal graph from a PSDD, which introduces augmented variables. Intervening on the original variables once again reduces to computing marginal distributions, but interventions on the augmented variables admit a deterministic, yet nonetheless causal, semantics for PSDDs.
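To make the tractability claim concrete, the following is a minimal sketch, not taken from the paper, of linear-time marginal inference in an SPN over two binary variables. The node classes, weights, and variable names are illustrative assumptions; the key mechanism is standard: marginalizing a variable sets its indicator leaves to 1, after which a single bottom-up pass over the network computes the marginal.

```python
# Minimal SPN sketch (illustrative, not the paper's construction).
# Leaves are indicator functions; an unobserved variable's indicators
# evaluate to 1, so one bottom-up pass yields any marginal in time
# linear in the number of edges.

class Leaf:
    def __init__(self, var, value):
        self.var, self.value = var, value

    def eval(self, evidence):
        # Unobserved variable: indicator integrates out to 1.
        if self.var not in evidence:
            return 1.0
        return 1.0 if evidence[self.var] == self.value else 0.0

class Sum:
    def __init__(self, children, weights):
        self.children, self.weights = children, weights

    def eval(self, evidence):
        # Weighted mixture of child values.
        return sum(w * c.eval(evidence)
                   for w, c in zip(self.weights, self.children))

class Product:
    def __init__(self, children):
        self.children = children

    def eval(self, evidence):
        # Factorized product over disjoint scopes.
        result = 1.0
        for c in self.children:
            result *= c.eval(evidence)
        return result

# A small valid SPN: a mixture of two fully factorized
# distributions over binary variables X1 and X2.
spn = Sum(
    children=[
        Product([Sum([Leaf(1, 0), Leaf(1, 1)], [0.8, 0.2]),
                 Sum([Leaf(2, 0), Leaf(2, 1)], [0.3, 0.7])]),
        Product([Sum([Leaf(1, 0), Leaf(1, 1)], [0.1, 0.9]),
                 Sum([Leaf(2, 0), Leaf(2, 1)], [0.6, 0.4])]),
    ],
    weights=[0.5, 0.5],
)

print(spn.eval({1: 1}))        # marginal P(X1 = 1) = 0.55
print(spn.eval({1: 1, 2: 0}))  # joint P(X1 = 1, X2 = 0) = 0.30
```

The paper's negative result for SPNs can be read against this picture: queries of the form P(Y | do(X = x)) on the causal graph obtained from such a network collapse to exactly these marginal computations, so the intervention adds no expressive power.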
