Speech compression with cosine and wavelet packet near-best bases

Compression of speech from the TIMIT corpus was investigated for several transform-domain methods that code near-best and best bases from cosine packet and wavelet packet transforms. Satisficing (suboptimizing) search algorithms for selecting near-best bases were compared with optimizing algorithms for selecting best bases in these adaptive tree-structured transforms. Experiments were performed on several hundred seconds of speech spoken by male and female speakers from all dialect regions of the TIMIT corpus. Near-best bases provided rate-distortion performance effectively as good as that of best bases, without the additional computational cost of the optimizing search. Cosine packet bases outperformed wavelet packet bases.
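The difference between an optimizing search and a satisficing search over a tree-structured transform can be illustrated with a minimal sketch. It is not the set of algorithms evaluated in this study. It assumes a Haar filter pair, an entropy-like additive cost, and a simple greedy stopping rule, and the names `haar_split`, `cost`, `best_basis`, and `near_best_basis` are hypothetical.

```python
import math

def haar_split(x):
    """One orthonormal Haar step: return (lowpass, highpass) halves of x."""
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return low, high

def cost(c):
    """Additive entropy-like cost: sum of -v^2 * log(v^2) over nonzero v."""
    return sum(-(v * v) * math.log(v * v) for v in c if v != 0.0)

def best_basis(x, depth):
    """Optimizing search: expand the full tree, then keep the cheaper of
    a node and the best bases of its two children (bottom-up)."""
    if depth == 0 or len(x) < 2:
        return cost(x), [x]
    low, high = haar_split(x)
    cost_low, leaves_low = best_basis(low, depth - 1)
    cost_high, leaves_high = best_basis(high, depth - 1)
    if cost_low + cost_high < cost(x):
        return cost_low + cost_high, leaves_low + leaves_high
    return cost(x), [x]

def near_best_basis(x, depth):
    """Satisficing search (assumed greedy rule): split a node only when its
    immediate children are cheaper, so deeper levels are never computed
    below a node that is kept."""
    if depth == 0 or len(x) < 2:
        return cost(x), [x]
    low, high = haar_split(x)
    if cost(low) + cost(high) >= cost(x):
        return cost(x), [x]
    cost_low, leaves_low = near_best_basis(low, depth - 1)
    cost_high, leaves_high = near_best_basis(high, depth - 1)
    return cost_low + cost_high, leaves_low + leaves_high

# Toy signal (not speech data): a sinusoid plus one spike, unit energy.
raw = [math.sin(0.4 * n) + (1.0 if n == 5 else 0.0) for n in range(16)]
norm = math.sqrt(sum(v * v for v in raw))
signal = [v / norm for v in raw]

best_cost, best_leaves = best_basis(signal, 3)
near_cost, near_leaves = near_best_basis(signal, 3)
print("best-basis cost:", best_cost)
print("near-best cost: ", near_cost)
```

In this sketch the greedy search can never return a lower cost than the exhaustive one, since both choose from the same set of admissible bases, but it may skip most of the tree expansion. Each leaf block is a set of coefficients that a coder would then quantize.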
