Learning correction grammars

We investigate a new paradigm in the context of learning in the limit: learning correction grammars for classes of r.e. languages. Knowledge of a language may be represented by two sets of rules (two grammars), where the second grammar is used to make corrections to the first. Such a pair of grammars can be seen as a single description of (or grammar for) the language; we call such pairs correction grammars. Correction grammars capture the observable fact that people do correct their linguistic utterances during their usual linguistic activities. We show that learning correction grammars for classes of r.e. languages in the TxtEx-model (i.e., converging to a single correct correction grammar in the limit) is sometimes more powerful than learning ordinary grammars, even in the TxtBc-model (where the learner is allowed to converge to infinitely many syntactically distinct but correct conjectures in the limit). For each n ≥ 0, there is a similar learning advantage when we compare learning correction grammars that make n + 1 corrections to those that make n corrections. The concept of a correction grammar can be extended into the constructive transfinite, using the idea of counting down from notations for constructive ordinals. For u a notation in Kleene's general system (O, <o) of ordinal notations, we introduce the concept of a u-correction grammar, where u is used to bound the number of corrections that the grammar is allowed to make. We prove a general hierarchy result: if u and v are notations for constructive ordinals such that u <o v, then there are classes of r.e. languages that can be TxtEx-learned by conjecturing v-correction grammars but not by conjecturing u-correction grammars. Surprisingly, we show that -- above "ω-many" corrections -- it is not possible to strengthen the hierarchy: TxtEx-learning u-correction grammars of classes of r.e. languages, where u is a notation in O for any constructive ordinal, can be simulated by TxtBc-learning w-correction grammars, where w is any notation for the smallest infinite ordinal ω.
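The two-grammar idea can be sketched concretely. A minimal illustration, assuming the standard reading in which a correction grammar is a pair (g1, g2) of enumeration procedures describing the set enumerated by g1 minus the set enumerated by g2 (the names `enumerate_set` and `correction_language` below are ours, and finite bounds stand in for genuinely unbounded r.e. enumeration):

```python
# Sketch: a correction grammar as a pair (g1, g2) of enumerators.
# g1 may over-generate; g2 "corrects" by retracting elements,
# so the described language is W_g1 minus W_g2 (one correction).

def enumerate_set(rule, bound):
    """Finite-stage stand-in for an r.e. enumerator: collects the
    members of 0..bound-1 accepted by the given predicate."""
    return {x for x in range(bound) if rule(x)}

def correction_language(g1, g2, bound):
    """Finite-stage approximation of the set W_g1 \\ W_g2."""
    return enumerate_set(g1, bound) - enumerate_set(g2, bound)

# Example: g1 over-generates all even numbers; g2 retracts the
# multiples of 4, so the pair describes evens not divisible by 4.
g1 = lambda x: x % 2 == 0
g2 = lambda x: x % 4 == 0
print(sorted(correction_language(g1, g2, 20)))  # [2, 6, 10, 14, 18]
```

An n-correction grammar would iterate this difference n times; the u-correction grammars of the abstract generalize that count transfinitely, with the ordinal notation u bounding how often the count-down of remaining corrections may tick.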
