Incremental Bayesian networks for structure prediction

We propose a class of graphical models appropriate for structure prediction problems where the model structure is a function of the output structure. Incremental Sigmoid Belief Networks (ISBNs) avoid the need to sum over the possible model structures by using directed arcs and specifying the model structure incrementally. Exact inference in such directed models is intractable, but we derive two efficient approximations based on mean field methods, which prove effective in artificial experiments. We then demonstrate their effectiveness on a benchmark natural language parsing task, where they achieve state-of-the-art accuracy. Moreover, the model that more closely approximates an ISBN achieves better parsing accuracy, suggesting that ISBNs are an appropriate abstract model of structure prediction tasks.
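To make the intractability and the appeal of a mean field style approximation concrete, the following minimal sketch compares an exact marginal with a simple approximation for a single binary variable in a sigmoid belief network. It is a generic illustration and not the approximations derived in the paper: it assumes independent binary parents with known means, and the function names and toy numbers are hypothetical. The exact computation enumerates all 2^n parent configurations, while the approximation replaces each parent by its mean and applies the sigmoid once.

```python
import itertools
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def exact_child_marginal(parent_means, weights, bias):
    """P(child = 1), summing over all 2^n binary parent configurations."""
    total = 0.0
    for config in itertools.product([0, 1], repeat=len(parent_means)):
        # Probability of this configuration under independent parents (assumption).
        p_config = 1.0
        for value, mean in zip(config, parent_means):
            p_config *= mean if value == 1 else 1.0 - mean
        activation = bias + sum(w * v for w, v in zip(weights, config))
        total += p_config * sigmoid(activation)
    return total


def mean_field_child_marginal(parent_means, weights, bias):
    """Approximate P(child = 1) by plugging in the parent means (linear in n)."""
    activation = bias + sum(w * m for w, m in zip(weights, parent_means))
    return sigmoid(activation)


# Hypothetical toy parameters, chosen only for illustration.
parent_means = [0.2, 0.7, 0.5]
weights = [1.5, -2.0, 0.8]
bias = 0.1

exact = exact_child_marginal(parent_means, weights, bias)
approx = mean_field_child_marginal(parent_means, weights, bias)
print(f"exact = {exact:.4f}, mean-field-style approximation = {approx:.4f}")
```

The exact sum grows exponentially with the number of parents, whereas the plug-in approximation costs one weighted sum. The two agree exactly when every parent is deterministic and drift apart as the parents become more uncertain and the weights larger.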
