Language Learning from Texts: Degrees of Intrinsic Complexity and Their Characterizations

Abstract. This paper deals with two problems: (1) what makes languages learnable in the limit by natural strategies of varying hardness, and (2) what makes classes of languages the hardest ones to learn. To quantify hardness of learning, we use intrinsic complexity based on reductions between learning problems. Two types of reductions are considered: weak reductions mapping texts (representations of languages) to texts and strong reductions mapping languages to languages. For both types of reductions, characterizations of complete (hardest) classes in terms of their algorithmic and topological potentials have been obtained. To characterize the strong complete degree, we discovered a new and natural complete class capable of “coding” any learning problem using density of the set of rational numbers. We have also discovered and characterized rich hierarchies of degrees of complexity based on “core” natural learning problems. The classes in these hierarchies contain “multidimensional” languages, where the information learned from one dimension aids in learning other dimensions. In one formalization of this idea, the grammars learned from the dimensions 1, 2, …, k specify the “subspace” for the dimension k + 1, while the learning strategy for every dimension is predefined. In our other formalization, a “pattern” learned from the dimension k specifies the learning strategy for the dimension k + 1. A number of open problems are discussed.
