Spoken language understanding: a survey

This paper presents a survey of research on spoken language understanding. It covers knowledge representation, automatic interpretation strategies, semantic grammars, conceptual language models, semantic event detection, shallow semantic parsing, semantic classification, semantic confidence, and active learning.
