Bayesian model averaging and model selection for Markov equivalence classes of acyclic digraphs

Acyclic digraphs (ADGs) are widely used to describe dependences among variables in multivariate distributions. In particular, the likelihood functions of ADG models admit convenient recursive factorizations that often allow explicit maximum likelihood estimates and that are well suited to building Bayesian networks for expert systems. There may, however, be many ADGs that determine the same dependence (Markov) model. Thus, the family of all ADGs with a given set of vertices is naturally partitioned into Markov-equivalence classes, each class being associated with a unique statistical model. Statistical procedures such as model selection or model averaging that fail to take these equivalence classes into account may incur substantial computational or other inefficiencies. Recent results have shown that each Markov-equivalence class is uniquely determined by a single chain graph, the essential graph, that is itself Markov-equivalent simultaneously to all ADGs in the equivalence class. Here we propose t...
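
To illustrate how several ADGs can determine the same Markov model, the following is a minimal sketch of an equivalence check. It relies on the standard graphical criterion, which is not stated in the abstract itself: two ADGs on the same vertex set are Markov-equivalent exactly when they share the same skeleton (undirected adjacencies) and the same immoralities (configurations a -> c <- b with a and b non-adjacent). The function names and the edge-list representation are assumptions made for this example, and the inputs are assumed to be acyclic; no acyclicity check is performed.

```python
from itertools import combinations


def skeleton(edges):
    """Undirected adjacencies of a digraph given as (parent, child) pairs."""
    return {frozenset(edge) for edge in edges}


def immoralities(edges):
    """Triples (a, c, b) with a -> c <- b where a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pars in parents.items():
        # Sorting makes each unordered parent pair appear in one canonical order.
        for a, b in combinations(sorted(pars), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found


def markov_equivalent(edges1, edges2):
    """Same skeleton and same immoralities (same vertex set is assumed)."""
    return (skeleton(edges1) == skeleton(edges2)
            and immoralities(edges1) == immoralities(edges2))


# Three ADGs on the vertices a, b, c that encode "a independent of c given b".
chain = [("a", "b"), ("b", "c")]
reversed_chain = [("c", "b"), ("b", "a")]
fork = [("b", "a"), ("b", "c")]
# The collider has the same skeleton but a different Markov model.
collider = [("a", "b"), ("c", "b")]

print(markov_equivalent(chain, reversed_chain))
print(markov_equivalent(chain, fork))
print(markov_equivalent(chain, collider))
```

In this sketch the chain, the reversed chain, and the fork fall into one equivalence class, while the collider forms a class of its own. A search over individual ADGs would visit the first three separately even though they represent a single statistical model, which is the kind of inefficiency the abstract refers to.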
