Experimental comparison between stationary and nonstationary formulations of linear prediction applied to voiced speech analysis

This paper presents the theoretical differences between the stationary (autocorrelation) and nonstationary (covariance) formulations of linear prediction, together with the results of an experimental comparison of the two when applied to voiced speech analysis. Three criteria are used for the comparison: 1) total minimum normalized squared error, 2) accuracy in estimating the speech spectrum, and 3) accuracy in estimating formant parameters. Results of pitch-synchronous and pitch-asynchronous linear prediction analyses of synthetic and natural speech are given. The influence of analysis segment size and position on the estimated formant parameters and on the total minimum normalized squared error has been investigated. For pitch-synchronous analysis, nonstationarity is a better assumption than stationarity, but for pitch-asynchronous analysis with a large analysis segment (20-25 ms), the two formulations perform practically the same in representing the speech waveform.
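As an illustration of the two formulations, the following is a minimal sketch, not the procedure used in the paper. It assumes the standard textbook definitions: the autocorrelation method treats the segment as zero outside the analysis window and solves a Toeplitz system, while the covariance method minimizes the error only over samples p..N-1 of the segment. The function names (`lpc_autocorrelation`, `lpc_covariance`, `synth_impulse_response`), the rectangular window, and the second-order test signal are all assumptions introduced here for illustration.

```python
import numpy as np

def lpc_autocorrelation(s, p):
    """Stationary (autocorrelation) formulation; the segment is assumed zero outside the window."""
    s = np.asarray(s, dtype=float)
    n = len(s)
    # Short-time autocorrelation r[k] = sum_m s[m] * s[m + k], k = 0..p
    r = np.array([np.dot(s[:n - k], s[k:]) for k in range(p + 1)])
    # Toeplitz normal equations: R a = r[1:]
    R = np.array([[r[abs(i - k)] for k in range(p)] for i in range(p)])
    a = np.linalg.solve(R, r[1:])
    err = r[0] - np.dot(a, r[1:])      # total minimum squared error
    return a, err / r[0]               # coefficients, normalized error

def lpc_covariance(s, p):
    """Nonstationary (covariance) formulation; error is minimized over samples p..N-1 only."""
    s = np.asarray(s, dtype=float)
    n = len(s)
    # Row i holds s[m - i] for m = p..N-1, so phi[i, k] = sum_m s[m - i] * s[m - k]
    X = np.array([s[p - i:n - i] for i in range(p + 1)])
    phi = X @ X.T
    a = np.linalg.solve(phi[1:, 1:], phi[1:, 0])
    err = phi[0, 0] - np.dot(a, phi[0, 1:])
    return a, err / phi[0, 0]

def synth_impulse_response(a, n):
    """Hypothetical test signal: impulse response of the all-pole filter s[i] = sum_k a[k-1] * s[i-k]."""
    s = np.zeros(n)
    for i in range(n):
        s[i] = 1.0 if i == 0 else 0.0   # unit impulse excitation at i = 0
        for k in range(1, len(a) + 1):
            if i - k >= 0:
                s[i] += a[k - 1] * s[i - k]
    return s

# One excitation-free "pitch period" of a single decaying resonance (assumed example values)
s = synth_impulse_response([1.2, -0.8], 80)
a_acf, v_acf = lpc_autocorrelation(s, 2)
a_cov, v_cov = lpc_covariance(s, 2)
print("autocorrelation:", a_acf, v_acf)
print("covariance:     ", a_cov, v_cov)
```

On this idealized segment the covariance error window contains no excitation sample, so the covariance solution should recover the generating coefficients up to rounding, whereas the autocorrelation solution is affected by the implicit truncation at the segment edges. This only sketches the kind of difference the pitch-synchronous comparison concerns and does not reproduce the paper's experiments.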