Modeling of English speech for the design of a distributed speech understanding system

This paper describes the derivation and verification of a phoneme model of English speech. The model generates a stream of phonemically labeled speech frames that stands in for speech input during the design of a distributed speech understanding system. New computer architectures intended to perform speech understanding in real time should incorporate information about the characteristics of English speech. To predict the performance of a new architecture, the design must be simulated using either massive amounts of speech data or, as an alternative, a statistical model of speech. A statistically generated phoneme stream avoids the computationally intensive acoustic parameterization of the enormous amount of speech input data that would otherwise be required to obtain representative phoneme distributions and patterns of speech.
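
As an illustration only, the idea of a statistically generated stream of phonemically labeled frames can be sketched as follows. This is a minimal sketch and not the model derived in the paper: it assumes a first-order transition table between phonemes and exponentially distributed phoneme durations, and the four-phoneme inventory, the probabilities in `TRANSITIONS`, the mean durations in `MEAN_FRAMES`, and the function name `generate_frames` are all hypothetical placeholders.

```python
import random

# Hypothetical toy statistics; none of these values come from the paper.
# TRANSITIONS[p][q] is the assumed probability that phoneme q follows phoneme p.
TRANSITIONS = {
    "s": {"ih": 0.6, "ae": 0.4},
    "ih": {"t": 0.5, "s": 0.5},
    "ae": {"t": 1.0},
    "t": {"s": 0.3, "ih": 0.4, "ae": 0.3},
}

# Assumed mean duration of each phoneme, measured in frames.
MEAN_FRAMES = {"s": 10, "ih": 6, "ae": 9, "t": 5}


def generate_frames(n_frames, start="s", seed=0):
    """Return a list of n_frames phoneme labels, one label per speech frame."""
    rng = random.Random(seed)  # seeded so that a simulation run is repeatable
    frames = []
    phoneme = start
    while len(frames) < n_frames:
        # Draw a duration around the phoneme's mean, never shorter than one frame.
        duration = max(1, round(rng.expovariate(1.0 / MEAN_FRAMES[phoneme])))
        frames.extend([phoneme] * duration)
        # Choose the next phoneme according to the transition probabilities.
        successors = TRANSITIONS[phoneme]
        phoneme = rng.choices(list(successors), weights=list(successors.values()))[0]
    return frames[:n_frames]


if __name__ == "__main__":
    print(" ".join(generate_frames(40)))
```

A simulated architecture could consume such a label stream frame by frame in place of parameterized acoustic input. In practice the transition and duration statistics would have to be estimated from measured speech rather than chosen by hand as they are here.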