Language understanding and subsequential transducer learning

Language understanding can be viewed as the realization of a mapping from sentences of a natural language to descriptions of their meaning in an appropriate formal language. From this viewpoint, the application of the Onward Subsequential Transducer Inference Algorithm (OSTIA) to language understanding is considered. The basic version of OSTIA is reviewed, and a new version is presented in which syntactic restrictions on the domain and/or range of the target transduction can be taken into account effectively. For experimentation, a task proposed by Feldman, Lakoff, Stolcke and Weber (1990) (International Computer Science Institute, Berkeley, California) for assessing the capabilities of language learning and understanding systems has been adopted, and three semantic coding schemes, each with a different source of difficulty, have been defined for it. In all cases, the basic version of OSTIA has consistently learned very compact and accurate transducers from relatively small training sets of input-output examples of the task. Moreover, when the input sentences are syntactically incorrect or corrupted by errors, the new version of OSTIA still provides understandable results, which degrade only in a gradual and natural way.
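To make the notion of a subsequential transducer concrete, the following is a minimal sketch of how such a device maps a sentence to a formal meaning description. It only applies a hand-written transducer; it does not implement OSTIA, which infers the transducer from examples. The class name `SubsequentialTransducer`, the toy vocabulary, and the output notation are illustrative assumptions and are not taken from the task or the coding schemes of the paper.

```python
from typing import Dict, List, Optional, Tuple

State = str
Transition = Tuple[State, List[str]]  # (next state, output tokens emitted)


class SubsequentialTransducer:
    """Deterministic finite-state transducer with an extra output per final state."""

    def __init__(
        self,
        initial: State,
        transitions: Dict[Tuple[State, str], Transition],
        final_outputs: Dict[State, List[str]],
    ) -> None:
        self.initial = initial
        self.transitions = transitions      # (state, input word) -> (state, output)
        self.final_outputs = final_outputs  # final state -> output appended at the end

    def transduce(self, words: List[str]) -> Optional[List[str]]:
        """Return the output tokens, or None if the input is not accepted."""
        state = self.initial
        output: List[str] = []
        for word in words:
            step = self.transitions.get((state, word))
            if step is None:        # no transition defined: input outside the domain
                return None
            state, emitted = step
            output.extend(emitted)
        if state not in self.final_outputs:  # stopped in a non-final state
            return None
        return output + self.final_outputs[state]


# Hypothetical toy transducer: output is delayed until the verb is read,
# so that the predicate can be emitted before its first argument.
toy = SubsequentialTransducer(
    initial="q0",
    transitions={
        ("q0", "a"): ("q1", []),
        ("q1", "circle"): ("q_circle", []),
        ("q1", "square"): ("q_square", []),
        ("q_circle", "touches"): ("q3", ["touch(", "circle", ","]),
        ("q_square", "touches"): ("q3", ["touch(", "square", ","]),
        ("q3", "a"): ("q4", []),
        ("q4", "circle"): ("q5", ["circle"]),
        ("q4", "square"): ("q5", ["square"]),
    },
    final_outputs={"q5": [")"]},
)

result = toy.transduce("a circle touches a square".split())
print(result)
```

The output attached to the final state (`")"` here) is what distinguishes a subsequential transducer from a plain sequential one: part of the output may be produced only once the end of the input has been reached.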
