Heuristic Greedy Search Algorithms for Latent Variable Models

A Bayesian network consists of two distinct parts: a directed acyclic graph (DAG, or belief-network structure) and a set of parameters for the DAG. The DAG in a Bayesian network can be used to represent both causal hypotheses and sets of probability distributions. Under the causal interpretation, a DAG with vertex set V represents the causal relations in a given population when there is an edge from A to B if and only if A is a direct cause of B relative to V. (We adopt the convention that sets of variables are capitalized and boldfaced, and individual variables are capitalized and italicized.) Under the statistical interpretation, a DAG G represents the set of all distributions that share the conditional independence relations entailed by satisfying a local directed Markov property (defined below).
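To make the statistical reading concrete, the sketch below encodes a small DAG as a map from each vertex to its parents and lists the independence statements that one common form of the local directed Markov property would yield: each variable is independent of its non-descendants (excluding its parents) given its parents. The four-variable graph, the names `PARENTS`, `descendants`, and `local_markov_statements`, and this particular wording of the property are illustrative assumptions, not taken from the text.

```python
from typing import Dict, Set, Tuple

# Hypothetical example DAG (an assumption for illustration): A -> B -> C and A -> D.
# Each vertex maps to the set of its parents.
PARENTS: Dict[str, Set[str]] = {
    "A": set(),
    "B": {"A"},
    "C": {"B"},
    "D": {"A"},
}


def descendants(parents: Dict[str, Set[str]], v: str) -> Set[str]:
    """Return every vertex reachable from v along directed edges (v excluded)."""
    # Invert the parent map to get the children of each vertex.
    children = {u: {w for w, ps in parents.items() if u in ps} for u in parents}
    seen: Set[str] = set()
    stack = [v]
    while stack:
        node = stack.pop()
        for child in children[node]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def local_markov_statements(
    parents: Dict[str, Set[str]],
) -> Dict[str, Tuple[Set[str], Set[str]]]:
    """For each vertex v, return (non-descendants of v minus its parents, parents of v).

    Under the assumed form of the local directed Markov property, v is
    independent of the first set given the second.
    """
    statements = {}
    for v in parents:
        non_desc = set(parents) - descendants(parents, v) - {v} - parents[v]
        statements[v] = (non_desc, set(parents[v]))
    return statements


if __name__ == "__main__":
    for v, (indep, given) in sorted(local_markov_statements(PARENTS).items()):
        if indep:
            print(f"{v} is independent of {sorted(indep)} given {sorted(given)}")
```

In this toy graph, for instance, B is independent of D given A. Any distribution satisfying all such statements would belong to the set that the DAG represents under the statistical interpretation.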