The Effectiveness of Corpus-Induced Dependency Grammars for Post-processing Speech

This paper investigates the impact of Constraint Dependency Grammars (CDG) on the accuracy of an integrated speech recognition and CDG parsing system. We compare a conventional CDG with CDGs that are induced from annotated sentences and template-expanded sentences. The grammars are evaluated on parsing speed, precision/coverage, and improvement of word and sentence accuracy of the integrated system. Sentence-derived CDGs significantly improve recognition accuracy over the conventional CDG but are less general. Expanding the sentences with templates provides us with a mechanism for increasing the coverage of the grammar with only minor reductions in recognition accuracy.
