A real-time isolated word recognizer for telephone input

We describe a new real-time isolated word recognizer with an improved user interface. The recognizer is designed for an Extension Number Guidance System, which looks up and announces an extension number through a telephone dialogue with the user. To deal with telephone-quality speech input, which includes noise and distortion introduced during transmission over the telephone network, we developed a feature extraction method and a word detection algorithm. Both techniques use the outputs of wide band-pass filters, which are generally employed to decide whether speech is voiced or unvoiced. To achieve a friendly interface, the system combines an echo canceler with the new word detection algorithm so that it can accept user input at any time. Finally, the recognizer is evaluated using a large telephone voice database of more than 500 speakers.
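
As a rough illustration of how an echo canceler can let a dialogue system keep listening while its own announcement is still on the line, the sketch below subtracts an adaptive estimate of the prompt echo from the incoming signal. The abstract does not say which echo-cancellation method the recognizer uses; the normalized LMS (NLMS) filter here is a generic stand-in, and the function names, tap count, step size, and the three-coefficient echo path are all assumptions chosen for the example.

```python
import random


def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the far-end echo from the mic signal.

    far_end: samples of the prompt played to the caller (reference signal).
    mic:     samples received from the line (echo plus any caller speech).
    Returns the residual signal that a word detector would then examine.
    """
    w = [0.0] * taps   # adaptive estimate of the echo path
    x = [0.0] * taps   # most recent far-end samples, newest first
    residual = []
    for ref, d in zip(far_end, mic):
        x = [ref] + x[:-1]
        predicted_echo = sum(wi * xi for wi, xi in zip(w, x))
        e = d - predicted_echo
        # NLMS update: step size normalized by the reference energy
        norm = sum(xi * xi for xi in x) + eps
        step = mu * e / norm
        w = [wi + step * xi for wi, xi in zip(w, x)]
        residual.append(e)
    return residual


def demo():
    """Echo-only test: the residual should fall far below the echo level."""
    rng = random.Random(0)
    prompt = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
    echo_path = [0.6, -0.3, 0.1]   # hypothetical echo path, not from the paper
    mic = [
        sum(h * prompt[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
        for n in range(len(prompt))
    ]
    residual = nlms_echo_cancel(prompt, mic)
    # Compare energies over the last 500 samples, after adaptation
    echo_energy = sum(v * v for v in mic[-500:])
    residual_energy = sum(v * v for v in residual[-500:])
    return echo_energy, residual_energy


if __name__ == "__main__":
    echo_energy, residual_energy = demo()
    print("echo energy:", echo_energy)
    print("residual energy:", residual_energy)
```

In this toy setting the microphone signal contains only the echo, so the residual shrinks toward zero once the filter has adapted. In a real system the residual would also carry the caller's speech, and that is the signal a word detection stage would operate on.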
