Paper information - Model Merging versus Model Splitting Context-Free Grammar Induction

A comparison of grammatical inference algorithms shows that the same generic techniques recur across different systems. Several finite-state learning algorithms use state merging as their underlying technique, and a collection of grammatical inference algorithms that aim to learn context-free grammars build on the concept of substitutability to identify potential grammar rules. When learning context-free grammars, there are essentially two approaches: model merging, which generalizes with more data, and model splitting, which specializes with more data. Both approaches can be combined sequentially in a generic framework. In this article, we investigate how different approaches within the first phase of the framework affect system performance.
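
The notion of substitutability mentioned in the abstract can be illustrated with a minimal sketch. This is not the framework or any system from the article: the function name `substitutable_groups` and the toy sentences are assumptions introduced purely for illustration. The sketch records, for every contiguous span of a sentence, the words to its left and right, and treats spans that occur in an identical context as candidates for the same grammar rule.

```python
from collections import defaultdict


def substitutable_groups(sentences):
    """Group word spans that occur in exactly the same sentence context.

    `sentences` is a list of token lists. A context is the pair
    (tokens before the span, tokens after the span). Spans sharing a
    context are hypothesised to be substitutable for one another.
    Hypothetical helper for illustration, not the article's method.
    """
    contexts = defaultdict(set)
    for tokens in sentences:
        n = len(tokens)
        for start in range(n):
            for end in range(start + 1, n + 1):
                if start == 0 and end == n:
                    continue  # the whole sentence has an empty context; skip it
                context = (tuple(tokens[:start]), tuple(tokens[end:]))
                contexts[context].add(tuple(tokens[start:end]))
    # Keep only contexts in which at least two different spans were seen.
    return {ctx: spans for ctx, spans in contexts.items() if len(spans) > 1}


# Toy input (assumed data, not taken from the article).
sentences = [["the", "dog", "barks"], ["the", "cat", "barks"]]
groups = substitutable_groups(sentences)
```

On this toy input, "dog" and "cat" end up in one group because both appear between "the" and "barks". A real induction system would need far more than this, for example a way to select among overlapping hypotheses, which the sketch deliberately leaves out.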

Nanne van Noord | Menno van Zaanen
