Statistical properties of speech signals

Statistical information about the important properties of speech signals is needed so that problems such as amplifier loading, quantisation, companding, voice switching and echo suppression can be properly analysed. The following headings are used:

Conversation statistics. The participants emit speech in relatively short utterances, averaging about 1 s in duration, and alternate between talking and listening roles some 15 times per minute.

Volume distribution. The mean power of speech signals on a telephone circuit is characteristic of the person talking under the particular conditions that apply; the mean power while the talker is active, expressed in decibels, is termed 'volume'.

Short-term mean power. Speech is generated in the vocal organs by a modulation process, and the resulting sound-pressure waveforms can be treated as having a slowly varying (predominantly six cycles per second) 'modulation' component impressed on a much higher-frequency (200-8000 c/s) 'carrier' component. Mean power averaged over the duration of a short syllable (say 20 ms) has, for a given talker while active, a distribution that can be approximated by a mathematical expression.

Instantaneous voltage distribution. The distribution of instantaneous voltage of speech at a given volume can be approximated by a suitable mathematical expression, which depends upon the characteristics of the microphone involved.
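As an illustration of the quantities named above, the following minimal sketch computes short-term mean power over 20 ms windows and a 'volume' figure, taken here as the mean power over active windows expressed in decibels. The sampling rate, the activity threshold, the decibel reference power and the synthetic test signal (a 6 c/s envelope on a single 400 c/s tone) are all assumptions made for the example and are not taken from the text; the approximating distributions mentioned above are not reproduced.

```python
import math


def short_term_power(samples, rate_hz, window_s=0.020):
    """Mean power (mean of squared samples) over consecutive,
    non-overlapping windows of window_s seconds."""
    n = max(1, int(round(rate_hz * window_s)))
    return [
        sum(x * x for x in samples[i:i + n]) / n
        for i in range(0, len(samples) - n + 1, n)
    ]


def volume_db(samples, rate_hz, active_threshold=1e-4, ref_power=1.0):
    """Mean power over active windows only, in dB relative to ref_power.
    The threshold and reference are illustrative assumptions."""
    powers = short_term_power(samples, rate_hz)
    active = [p for p in powers if p > active_threshold]
    if not active:
        return float("-inf")  # talker never active
    return 10.0 * math.log10(sum(active) / len(active) / ref_power)


def synthetic_speech(rate_hz=8000, seconds=1.0, carrier_hz=400.0, mod_hz=6.0):
    """Toy signal: a slowly varying envelope impressed on one carrier tone.
    Real speech is far richer; this only mimics the modulation picture."""
    out = []
    for k in range(int(rate_hz * seconds)):
        t = k / rate_hz
        envelope = 0.5 * (1.0 - math.cos(2.0 * math.pi * mod_hz * t))
        out.append(envelope * math.sin(2.0 * math.pi * carrier_hz * t))
    return out


signal = synthetic_speech()
powers = short_term_power(signal, 8000)
print(len(powers), "windows; volume =", round(volume_db(signal, 8000), 2), "dB")
```

A histogram of the values returned by short_term_power, taken over the active windows of recorded speech, would be the empirical counterpart of the short-term mean power distribution described above.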