In this paper we describe a speaker-trained, voice-controlled repertory dialer system. The main elements of the system are: (1) a real-time speech analyzer that detects the presence of speech on the input line and analyzes the speech to give features appropriate for a word recognizer; (2) an isolated word recognizer that decides which of a set of words was spoken; (3) a voice response system that provides spoken commands to guide the user through the use of the repertory dialer system; and (4) a dialer (simulated) to outpulse the desired telephone number. The repertory dialer system is implemented on a minicomputer, with a high-speed array processor performing the real-time operations. The vocabulary for the system consists of 7 command words, 10 digits, and any number of names up to some specified maximum. Recognition is performed on one or more subsets of the vocabulary, depending on the state of the system. To train the system, the user is requested to speak each of the vocabulary words twice to provide reference templates for the system. Following training, the system can dial the telephone number corresponding to any name in the repertory, or it can dial a 4-digit telephone extension spoken as an isolated string of digits. The system was tested extensively by 6 talkers (3 male and 3 female; 3 naive and 3 experienced users) over a three-week period. A total of 4620 words were spoken, and during the course of the test there were no recognition errors. A request for a repeat of a spoken word occurred about 2% of the time. These tests demonstrate the reliability and robustness of this voice repertory dialer system.
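As an illustration of state-dependent recognition with a repeat request, the following minimal Python sketch restricts matching to the vocabulary subset that is active in the current system state, compares the input against two reference templates per word, and returns no decision when the best match is too poor. Everything concrete in it is an assumption made for illustration: the state names, the example words, the two-element feature vectors, the Euclidean distance, and the rejection threshold are hypothetical, since the abstract does not state the distance measure or decision rule actually used.

```python
import math

# Hypothetical vocabulary subsets per system state (illustrative names only,
# not taken from the paper).
STATE_VOCAB = {
    "command": {"dial", "stop"},
    "digits": {"one", "two"},
}

# Two reference templates per word, mirroring the training step in which
# each vocabulary word is spoken twice. The feature values are made up.
TEMPLATES = {
    "dial": [[0.0, 0.0], [0.1, 0.0]],
    "stop": [[1.0, 1.0], [1.0, 0.9]],
    "one": [[0.0, 0.0], [0.0, 0.1]],
    "two": [[2.0, 2.0], [2.1, 2.0]],
}


def distance(a, b):
    """Euclidean distance between equal-length feature vectors.

    This is a stand-in for whatever distance the real recognizer uses.
    """
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(features, templates, state, reject_threshold=1.0):
    """Return the best-matching word in the active subset.

    Returns None when the best distance exceeds the (assumed) threshold,
    which corresponds to asking the user to repeat the word.
    """
    best_word, best_dist = None, float("inf")
    for word in STATE_VOCAB[state]:
        for ref in templates.get(word, []):
            d = distance(features, ref)
            if d < best_dist:
                best_word, best_dist = word, d
    if best_dist > reject_threshold:
        return None
    return best_word
```

In this sketch the same input can map to different words depending on the state, because only the active subset is searched; a smaller active subset leaves fewer candidates to confuse. A `None` result stands for the repeat request mentioned in the abstract.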