Progressing the state-of-the-art in grammatical inference by competition: The Omphalos Context-Free Language Learning Competition

This paper describes the Omphalos Context-Free Language Learning Competition, held as part of the International Colloquium on Grammatical Inference 2004. Following the success of the Abbadingo Competition on the better-known task of learning regular languages, Omphalos was created to promote the development of new and better grammatical inference algorithms for context-free languages, to provide a forum for comparing different grammatical inference algorithms, and to gain insight into the current state of the art in context-free grammatical inference. The paper discusses the design issues and decisions involved in creating the competition, which led to the introduction of a new complexity measure for estimating the difficulty of learning a context-free grammar. It also presents the results of the competition and the lessons learned.
