Computer recognition of the continuant phonemes in connected English speech

A method for phoneme recognition in connected speech is described. Input to the system is assumed to consist of the 24 continuant phonemes in connected English speech. The system first categorizes each successive 20-ms segment of the input utterance as one of four classes (voiced fricative, voiced nonfricative, unvoiced fricative, or no-speech), using a measure of the relative energy balance between low and high frequencies. Next, each 20-ms segment is recognized from a distribution of axis-crossing intervals of speech prefiltered to emphasize each formant frequency range. Segmentation is then performed from the per-segment recognition results and from changes in categorization. Finally, the recognition results for the 20-ms segments between each pair of segmentation boundaries are combined, and the phonemic sound occurring most frequently is printed out. The system has been trained for a single male speaker. Preliminary results for this speaker on four sentences of 3 to 4 s each indicate a correct categorization decision for about 97 percent of the input 20-ms segments, a correct recognition for about 78 percent of the input 20-ms segments, and an overall correct phoneme recognition for about 87 percent of the input phonemes.
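The processing steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the two-tap sum/difference filters used as a stand-in for the low/high energy measure, and the test tone are all assumptions made for illustration. The formant prefiltering and the trained decision rules are omitted because the abstract does not specify them.

```python
from collections import Counter
import math


def frames(samples, rate, ms=20):
    """Split samples into consecutive, non-overlapping segments of `ms` milliseconds."""
    n = int(rate * ms / 1000)
    return [samples[i:i + n] for i in range(0, len(samples) - n + 1, n)]


def high_energy_fraction(frame):
    """Crude energy-balance measure (assumed form, not from the paper).

    The first difference acts as a simple high-pass filter and the
    two-sample sum as a simple low-pass filter; the result is the share
    of total filtered energy that falls in the high-pass branch.
    """
    high = sum((b - a) ** 2 for a, b in zip(frame, frame[1:]))
    low = sum((b + a) ** 2 for a, b in zip(frame, frame[1:]))
    total = high + low
    return high / total if total else 0.0


def crossing_intervals(frame):
    """Return the number of samples between successive axis (zero) crossings."""
    crossings = [i for i in range(1, len(frame))
                 if (frame[i - 1] < 0) != (frame[i] < 0)]
    return [b - a for a, b in zip(crossings, crossings[1:])]


def interval_histogram(frame):
    """Distribution of axis-crossing intervals for one segment."""
    return Counter(crossing_intervals(frame))


def majority_label(labels):
    """Most frequent per-segment label between two segmentation boundaries."""
    return Counter(labels).most_common(1)[0][0]


def tone(freq, rate, count, phase=0.3):
    """Hypothetical test signal: a sine tone with a small phase offset."""
    return [math.sin(2 * math.pi * freq * t / rate + phase) for t in range(count)]


if __name__ == "__main__":
    rate = 8000  # assumed sampling rate in Hz
    segment = frames(tone(100, rate, 800), rate)[0]
    print(interval_histogram(segment))
    print(high_energy_fraction(segment))
    print(majority_label(["s", "s", "z", "s"]))
```

In this sketch a low-frequency tone yields long crossing intervals and a small high-energy fraction, while a high-frequency tone yields the opposite. A real system would need thresholds and reference distributions learned from a speaker's training data.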
