Evolutionary induction of stochastic context-free grammars

This paper describes an evolutionary approach to the problem of inferring stochastic context-free grammars from finite language samples. The approach employs a distributed, steady-state genetic algorithm (GA), with a fitness function that incorporates a prior over the space of possible grammars. Our choice of prior is designed to bias learning towards structurally simpler grammars. Solutions to the inference problem are evolved by optimizing the parameters of a covering grammar for a given language sample. Full details are given of our GA and of our fitness function for grammars. We present the results of a number of experiments in learning grammars for a range of formal languages. Finally, we compare the grammars induced using the GA-based approach with those found using the inside-outside algorithm. We find that our approach learns grammars that are compact and that fit the corpus data well.
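To make the idea of a fitness function that combines data fit with a preference for simpler grammars concrete, the following is a minimal sketch only. It assumes a grammar in Chomsky normal form and scores it as the log-likelihood of the sample minus a fixed penalty per rule with non-negligible probability. The penalty term is a stand-in chosen for illustration and is not the prior used in the paper; the names `inside_probability`, `fitness`, `penalty`, and `eps`, and the toy grammar for the language a^n b^n, are all hypothetical.

```python
import math
from collections import defaultdict


def inside_probability(binary, lexical, start, s):
    """Probability of string s under a CNF stochastic grammar (inside algorithm).

    binary  maps (A, B, C) -> P(A -> B C)
    lexical maps (A, a)    -> P(A -> a)
    """
    n = len(s)
    if n == 0:
        return 0.0
    # beta[i][j][A] = probability that nonterminal A derives s[i..j] inclusive
    beta = [[defaultdict(float) for _ in range(n)] for _ in range(n)]
    for i, sym in enumerate(s):
        for (a, term), p in lexical.items():
            if term == sym:
                beta[i][i][a] += p
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):  # split point: left = s[i..k], right = s[k+1..j]
                for (a, b, c), p in binary.items():
                    left = beta[i][k].get(b, 0.0)
                    right = beta[k + 1][j].get(c, 0.0)
                    if left and right:
                        beta[i][j][a] += p * left * right
    return beta[0][n - 1].get(start, 0.0)


def fitness(binary, lexical, start, sample, penalty=1.0, eps=1e-6):
    """Penalized log-likelihood: higher is better.

    ASSUMPTION: the complexity term is a simple per-rule penalty, used here
    only to illustrate a bias towards structurally simpler grammars.
    """
    log_lik = 0.0
    for s in sample:
        p = inside_probability(binary, lexical, start, s)
        if p <= 0.0:
            return float("-inf")  # grammar does not cover the sample
        log_lik += math.log(p)
    probs = list(binary.values()) + list(lexical.values())
    n_rules = sum(1 for p in probs if p > eps)  # rules that are effectively "alive"
    return log_lik - penalty * n_rules


# Hypothetical toy grammar for a^n b^n: S -> A B | A X, X -> S B, A -> a, B -> b
BINARY = {("S", "A", "B"): 0.5, ("S", "A", "X"): 0.5, ("X", "S", "B"): 1.0}
LEXICAL = {("A", "a"): 1.0, ("B", "b"): 1.0}

if __name__ == "__main__":
    print(fitness(BINARY, LEXICAL, "S", ["ab", "aabb"]))
```

In a GA setting, a function of this kind would be evaluated for each candidate parameter vector of the covering grammar; rules whose probabilities are driven below `eps` no longer count against the grammar, which is one simple way of rewarding compactness.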
