"Almost parsing" technique for language modeling

We present an approach that incorporates structural information into language models without fully parsing the utterance. This approach combines the advantages of an n-gram language model (speed, robustness, and the ability to integrate with the speech recognizer) with the need to model syntactic constraints, all under a uniform representation. We also show that our approach produces better language models than those based on part-of-speech tags.
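For orientation, the kind of tag-based n-gram model used as the point of comparison can be sketched as follows. This is a minimal illustration under stated assumptions, not the model described in this paper: it assumes each word arrives with a single known tag, uses a bigram over tags with add-one smoothing, and scores the joint probability of the words and their tags. The function names, the toy corpus, and the tag labels are all hypothetical.

```python
import math
from collections import Counter

START = "<s>"  # hypothetical sentence-start symbol


def train_tag_bigram(tagged_sentences):
    """Collect tag-transition and word-emission counts from (word, tag) sequences."""
    trans, context, emit, tag_total = Counter(), Counter(), Counter(), Counter()
    vocab, tags = set(), set()
    for sentence in tagged_sentences:
        prev = START
        for word, tag in sentence:
            trans[(prev, tag)] += 1   # count of tag following prev
            context[prev] += 1        # count of prev used as a context
            emit[(tag, word)] += 1    # count of word emitted by tag
            tag_total[tag] += 1
            vocab.add(word)
            tags.add(tag)
            prev = tag
    return {"trans": trans, "context": context, "emit": emit,
            "tag_total": tag_total, "vocab": vocab, "tags": tags}


def log_prob(model, sentence):
    """Joint log P(words, tags) = sum of log P(tag | prev tag) + log P(word | tag).

    Add-one smoothing is an assumption of this sketch; the extra +1 in the
    emission denominator reserves mass for one unknown-word type.
    """
    n_tags = len(model["tags"])
    n_words = len(model["vocab"]) + 1
    total, prev = 0.0, START
    for word, tag in sentence:
        p_tag = (model["trans"][(prev, tag)] + 1) / (model["context"][prev] + n_tags)
        p_word = (model["emit"][(tag, word)] + 1) / (model["tag_total"][tag] + n_words)
        total += math.log(p_tag) + math.log(p_word)
        prev = tag
    return total


def perplexity(model, sentence):
    """Per-word perplexity of one tagged sentence under the joint model."""
    return math.exp(-log_prob(model, sentence) / len(sentence))


# Toy data, invented for illustration only.
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]
model = train_tag_bigram(corpus)
in_order = [("the", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")]
scrambled = [("sleeps", "VERB"), ("the", "DET"), ("dog", "NOUN")]
print(perplexity(model, in_order), perplexity(model, scrambled))
```

The scrambled word order receives a higher perplexity because its tag transitions were never observed in the toy corpus. A model of this shape keeps the speed and left-to-right structure of an n-gram model; replacing the part-of-speech labels with richer structural labels would leave the scoring code unchanged.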
