SCREEN: learning a flat syntactic and semantic spoken language analysis using artificial neural networks

Previous approaches to analyzing spontaneously spoken language have often been based on encoding syntactic and semantic knowledge manually and symbolically. While there has been some progress using statistical or connectionist language models, many current spoken-language systems still use a relatively brittle, hand-coded symbolic grammar or symbolic semantic component. In contrast, we describe a so-called screening approach for learning robust processing of spontaneously spoken language. A screening approach is a flat analysis which uses shallow sequences of category representations for analyzing an utterance at various syntactic, semantic and dialog levels. Rather than using a deeply structured symbolic analysis, we use a flat connectionist analysis. This screening approach aims at supporting speech and language processing by using (1) data-driven learning and (2) the robustness of connectionist networks. In order to test this approach, we have developed the SCREEN system, which is based on this new robust, learned and flat analysis. In this paper, we focus on a detailed description of SCREEN's architecture, the flat syntactic and semantic analysis, the interaction with a speech recognizer, and a detailed evaluation of robustness under noisy or incomplete input. The main result of this paper is that flat representations allow more robust processing of spontaneous spoken language than deeply structured representations. In particular, we show how the fault-tolerance and learning capability of connectionist networks can support a flat analysis for providing more robust spoken-language processing within an overall hybrid symbolic/connectionist framework.
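
To make the notion of a flat analysis concrete, the following is a minimal sketch, not the SCREEN implementation. It assigns each word one category per level and builds no tree, so an unrecognized word (for example a hesitation or a recognizer error) affects only its own position. The lexicon, the category names, and the names `TOY_LEXICON`, `FlatLabel` and `flat_analysis` are all hypothetical; a plain dictionary lookup stands in here for the learned connectionist networks described above.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical toy lexicon: word -> (syntactic category, semantic category).
# These category names are illustrative assumptions, not SCREEN's inventory.
# In the approach described above, categories are learned by networks;
# the lookup table is only a stand-in to show the flat output format.
TOY_LEXICON = {
    "i": ("pronoun", "person"),
    "need": ("verb", "need"),
    "a": ("determiner", "none"),
    "meeting": ("noun", "event"),
    "on": ("preposition", "none"),
    "monday": ("noun", "time"),
}


@dataclass
class FlatLabel:
    """One word with one category per analysis level (no nesting)."""
    word: str
    syntactic: str
    semantic: str


def flat_analysis(utterance: str) -> List[FlatLabel]:
    """Label each word independently; the result is a flat sequence."""
    labels = []
    for word in utterance.lower().split():
        # Unknown words get a placeholder label instead of aborting the analysis.
        syn, sem = TOY_LEXICON.get(word, ("unknown", "unknown"))
        labels.append(FlatLabel(word, syn, sem))
    return labels


if __name__ == "__main__":
    # "eh" is not in the toy lexicon; the remaining words are still labeled.
    for label in flat_analysis("I need eh a meeting on Monday"):
        print(label)
```

In this sketch the hesitation "eh" receives the placeholder label while every other word keeps its categories, which is the kind of local degradation a flat representation permits; a deep parse would instead have to decide how the unexpected token attaches to the rest of the structure.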
