Synthesizing Learners Tolerating Computable Noisy Data

An index for an r.e. class of languages generates, by definition, a sequence of grammars defining the class. An index for an indexed family of recursive languages generates, by definition, a sequence of decision procedures defining the family. F. Stephan's model of noisy data is employed, in which, roughly, correct data crops up infinitely often and incorrect data only finitely often. New to the present paper is the restriction that noisy data sequences nonetheless be computable. This restriction is of interest because we may live in a computable universe, and in such a universe all data sequences, even noisy ones, are computable. The paper studies the synthesis, from indices for r.e. classes and for indexed families of recursive languages, of various kinds of noise-tolerant language learners for the corresponding classes or families indexed, where the noisy input data sequences are restricted to being computable. Many positive results, as well as some negative results, are presented regarding the existence of such synthesizers. The main positive result is that grammars for each indexed family can be learned behaviorally correctly from computable, noisy, positive data. The proof of another positive synthesis result yields, as a pleasant corollary, a strict subset-principle or telltale style characterization, for the computable noise-tolerant behaviorally correct learnability of grammars from positive and negative data, of the corresponding families indexed.
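
The rough description of the noise model above (correct data infinitely often, incorrect data only finitely often) can be illustrated with a small sketch. It is not taken from the paper: the names `noisy_text`, `enumerate_language`, `noise`, and `evens` are hypothetical, and the construction is only one simple way to produce a computable sequence with the stated property for positive data.

```python
from itertools import count, islice

def noisy_text(enumerate_language, noise):
    """Yield a computable noisy text (illustrative sketch, not the paper's construction).

    enumerate_language: zero-argument function returning a fresh enumeration
                        of the target language (hypothetical helper).
    noise:              finite list of data items emitted once each.
    """
    # Incorrect data: each noise item appears only finitely often (here, once).
    for x in noise:
        yield x
    # Correct data: stage n replays the first n items of the enumeration,
    # so every element of the language recurs in infinitely many stages.
    for n in count(1):
        yield from islice(enumerate_language(), n)

def evens():
    # Example target language: the even natural numbers.
    return (2 * k for k in count())

prefix = list(islice(noisy_text(evens, [7, 3]), 8))
print(prefix)  # [7, 3, 0, 0, 2, 0, 2, 4]
```

Because the generator is itself a program, the sequence it produces is computable, which is the restriction on noisy data sequences that the abstract describes as new.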
