Colin de la Higuera: Grammatical inference: learning automata and grammars

Grammatical inference (GI) is concerned with the study of algorithms for learning automata and grammars from strings. Colin de la Higuera sees a subtle, informal, and almost philosophical distinction between grammar induction (finding a grammar that explains as much of the data as possible) and grammatical inference (finding the true or only target grammar covering a set of strings). The latter puts more emphasis on the learning process. Although the field has existed for a long time, and has had its own conference (the International Colloquium on Grammatical Inference, ICGI) for almost twenty years, this is the first monograph on the subject. It aims to bring together the most salient concepts and results, until now dispersed over different publications, in a uniform notation and framework.

In Chap. 1, the author surveys the scientific areas that have contributed to the field of grammatical inference. These include computational linguistics, obviously, with its formal work on language learnability, and inductive inference, which studies what classes of functions can be learned, how fast, and how to measure learning success. There is also pattern recognition and machine learning (especially complexity results in computational learning theory) and computational biology (DNA analysis). The last of these is an application area of GI that has contributed to the main results in that field. The introduction also establishes the main research questions and methodology of GI.

Strings and trees are ubiquitous in language data, biological data, and computer programs, but also in processed image data, music, chemical data, etc. These data, and the potential for GI to work on them, are described in Chap. 2. Machine translation is mentioned in this book only in passing, but some of the techniques it describes have also been used in this area. For example, Chap. 18 describes in detail the OSTIA system for learning transducers that may have some potential as a model for