Waveform substitution techniques for recovering missing speech segments in packet voice communications

Packet communication systems cannot, in general, guarantee accurate and prompt delivery of every packet. The effect of network congestion and transmission impairments on data packets is extended delay; in voice communications these problems lead to lost packets. When some speech packets are not available, the simplest response of a receiving terminal is to substitute silence for the missing speech. Here, we explore techniques for replacing missing speech with waveform segments from correctly received packets in order to increase the maximum tolerable missing packet rate. After presenting a simple formula for predicting the probability of waveform substitution failure as a function of packet duration and packet loss rate, we introduce two techniques for selecting substitution waveforms. One method is based on pattern matching and the other technique explicitly estimates voicing and pitch. Both approaches achieve substantial improvements in speech quality relative to silence substitution. After waveform substitution, a significant component of the perceived distortion is due to discontinuities at packet boundaries. To reduce this distortion, we introduce a simple smoothing procedure.
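
As a rough illustration of the pattern-matching idea, the sketch below fills one missing packet by taking the last few received samples as a template, searching earlier speech for the closest match, and copying the samples that followed that match. It is a minimal sketch under assumptions, not the procedure evaluated here: the function name, the absolute-difference error measure, and the `template_len` and `search_len` defaults are all hypothetical choices, and voicing and pitch estimation and boundary smoothing are left out.

```python
def substitute_by_pattern_matching(history, packet_len, template_len=8, search_len=64):
    """Return packet_len samples to stand in for one missing packet.

    history: list of float samples received (or substituted) before the gap.
    All parameter names and defaults are illustrative assumptions.
    """
    # Not enough past speech to search: fall back to silence substitution.
    if len(history) < template_len + packet_len:
        return [0.0] * packet_len

    # Template: the samples immediately preceding the missing packet.
    template = history[-template_len:]

    # A candidate window must be followed by at least packet_len samples,
    # so the copied segment lies entirely inside the received history.
    last_start = len(history) - template_len - packet_len
    first_start = max(0, last_start - search_len + 1)

    best_start, best_err = first_start, float("inf")
    for start in range(first_start, last_start + 1):
        # Sum of absolute differences between candidate window and template.
        err = sum(abs(history[start + i] - template[i]) for i in range(template_len))
        if err < best_err:
            best_start, best_err = start, err

    # Copy the samples that followed the best-matching window.
    begin = best_start + template_len
    return history[begin:begin + packet_len]
```

On a strictly periodic input the best match falls a whole number of periods before the gap, so the copied segment continues the waveform in phase. Real speech is only approximately periodic, which is one reason a discontinuity can remain at the packet boundary.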