Frequency selectivity via the SpEnt methodology for wideband speech compression

In speech and audio coding, the frequency selectivity of the basis functions is an important property of the codec: the sharper the selectivity, the lower the chance of audible coding artifacts caused by uncancelled aliasing. We use Campbell's (1960) coefficient rate and the spectral entropy (SpEnt) of the source random process as a guide for formulating adaptive nonuniform modulated lapped biorthogonal transforms (NMLBT). The NMLBT allows an efficient implementation of a time-varying transform that maintains good frequency and time resolution at all instants, without the need for transitional filters. By coupling the SpEnt methodology with modulated lapped biorthogonal transforms (MLBT), we develop band-combining strategies that produce an adaptive NMLBT. By the nature of the SpEnt methodology, the new frequency selection process is a non-linear approximation method that determines the best n basis functions for representing the current speech frame. We implement a wideband speech compression scheme based on this strategy and verify its improved performance in coding speech and audio signals at 16 and 24 kbps.
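As a rough illustration of the quantities named above, the following sketch computes a discrete spectral entropy for one frame's power spectrum, takes its exponential as an effective number of significant coefficients, and selects the n largest-magnitude coefficients as a simple non-linear approximation. It is a minimal sketch under stated assumptions, not the scheme implemented in the paper: the discrete normalization, the use of exp(entropy) as a stand-in for the coefficient rate, and the function names `spectral_entropy`, `effective_coefficients`, and `best_n_indices` are all hypothetical choices made for illustration.

```python
import math


def spectral_entropy(power):
    """Shannon entropy (in nats) of a power spectrum normalized to sum to 1."""
    total = sum(power)
    if total <= 0:
        raise ValueError("power spectrum must have positive total energy")
    probs = [p / total for p in power]
    # Bins with zero power contribute nothing (limit of p*log(p) as p -> 0).
    return -sum(p * math.log(p) for p in probs if p > 0)


def effective_coefficients(power):
    """exp(entropy): an effective bin count between 1 and len(power).

    Assumption: used here as a discrete stand-in for a coefficient rate.
    """
    return math.exp(spectral_entropy(power))


def best_n_indices(coeffs, n):
    """Indices of the n largest-magnitude coefficients, in ascending order."""
    order = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    return sorted(order[:n])


# A flat spectrum needs every bin; a peaky spectrum needs far fewer.
flat = [1.0] * 8
peaky = [100.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
n_flat = effective_coefficients(flat)
n_peaky = effective_coefficients(peaky)
```

One possible use, again an assumption rather than the paper's procedure, is to round `effective_coefficients` for the current frame to an integer n and pass it to `best_n_indices` to decide which transform coefficients to retain.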

[1]  Martin Vetterli, et al. Data Compression and Harmonic Analysis, 1998, IEEE Trans. Inf. Theory.

[2]  Richard V. Cox, et al. The design of uniformly and nonuniformly spaced pseudoquadrature mirror filters, 1986, IEEE Trans. Acoust. Speech Signal Process.

[3]  M. Victor Wickerhauser, et al. Adapted local trigonometric transforms and speech processing, 1993, IEEE Trans. Signal Process.

[4]  John Princen. The design of nonuniform modulated filterbanks, 1995, IEEE Trans. Signal Process.

[5]  Ronald R. Coifman, et al. Entropy-based algorithms for best basis selection, 1992, IEEE Trans. Inf. Theory.

[6]  Jerry D. Gibson, et al. Variable rate speech coding based on subband measures of spectral flatness, 1995.

[7]  Jerry D. Gibson, et al. Spectral entropy, equivalent bandwidth and minimum coefficient rate, 1997, Proceedings of IEEE International Symposium on Information Theory.

[8]  J. Gibson, et al. Coefficient rate and adaptive coding of side information, 1998.

[9]  Allen Gersho, et al. Variable rate speech coding with phonetic segmentation, 1993, 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[10]  Henrique S. Malvar. Biorthogonal and nonuniform lapped transforms for transform coding with reduced blocking and ringing artifacts, 1998, IEEE Trans. Signal Process.

[11]  L. Lorne Campbell, et al. Minimum Coefficient Rate for Stationary Random Processes, 1960, Inf. Control.

[12]  Henrique S. Malvar. Enhancing the performance of subband audio coders for speech signals, 1998, ISCAS '98. Proceedings of the 1998 IEEE International Symposium on Circuits and Systems (Cat. No.98CH36187).

[13]  Byeong Gi Lee, et al. A design of nonuniform cosine modulated filter banks, 1995.