An approach to computer speech recognition by direct analysis of the speech wave

A system is described for obtaining a phonemic transcription from a connected speech sample entered into the computer through a microphone and an analog‐to‐digital converter. The following features of the system are believed to be new: direct input of the speech signal to the computer without filters or spectrographs; the procedures for segmentation and pitch extraction; the procedure for prosodic parameter determination; and many procedures for phoneme classification. About 30 sounds of 1‐ to 2‐sec duration were analyzed on an IBM‐7090‐PDP1 disk system. Correct identification of many vowel and consonantal phonemes was achieved for a single cooperative speaker. The time for analysis of each sound varied from 40 to 75 sec. For example, the sentence “John has a book” resulted in the phoneme string output “J AA M AE Z EH (B D G) U K.” The results encourage continuing the approach with a more powerful computer to achieve real‐time recognition.