Complex Wavelet Modulation Subbands for Speech Compression

Low-frequency modulations of sound carry essential information for speech and music and must be preserved in compression. The complex modulation spectrum is commonly obtained by spectral analysis of only the temporal envelopes of the subbands produced by a time/frequency analysis. Because the amplitudes and tones of speech and music tend to vary slowly over time, these temporal envelopes are mostly of polynomial type. Processing in this domain usually creates undesirable distortions because only the magnitudes are taken into account and the phase data is often neglected. We remedy this problem by using a complex wavelet transform as a more appropriate tool for envelope and phase processing. Complex wavelets carry both magnitude and phase explicitly, with great sparsity, and preserve polynomials well. Moreover, an analytic Hilbert-like transform is possible with complex wavelets implemented as an orthogonal filter bank. Working in this alternative transform domain, which we call "Modulation Subbands", shows very promising compression capabilities thanks to its sparsity properties, and suggests new approaches for joint spectro-temporal analytic processing of audio signals whose frequency and phase vary slowly.
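As a rough illustration of the envelope and phase quantities discussed above, the following minimal sketch builds an analytic signal for a single amplitude-modulated tone and takes the spectrum of its envelope. It is not the complex wavelet filter bank proposed here: it swaps in a plain FFT-based Hilbert transform, and the function names (`analytic_signal`, `modulation_spectrum`) and the test signal (a 1000 Hz carrier modulated at 4 Hz, sampled at 8000 Hz) are assumptions chosen only for illustration.

```python
import numpy as np

def analytic_signal(x):
    """Return the analytic signal of a real 1-D array via the FFT."""
    n = len(x)
    spectrum = np.fft.fft(x)
    # Keep DC (and Nyquist for even n), double positive bins, zero negative bins.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def modulation_spectrum(envelope, fs):
    """Magnitude spectrum of the mean-removed envelope, with its frequency axis."""
    centered = envelope - envelope.mean()
    mags = np.abs(np.fft.rfft(centered))
    freqs = np.fft.rfftfreq(len(envelope), d=1.0 / fs)
    return freqs, mags

# Hypothetical test signal: one second of a slowly modulated tone.
fs = 8000
t = np.arange(fs) / fs
mod_hz = 4.0
carrier_hz = 1000.0
x = (1.0 + 0.5 * np.cos(2 * np.pi * mod_hz * t)) * np.cos(2 * np.pi * carrier_hz * t)

z = analytic_signal(x)
envelope = np.abs(z)            # magnitude: the slowly varying temporal envelope
phase = np.unwrap(np.angle(z))  # phase: what magnitude-only processing discards

freqs, mags = modulation_spectrum(envelope, fs)
peak_hz = freqs[np.argmax(mags)]  # dominant modulation frequency of the envelope
```

In this toy case the envelope spectrum is concentrated at the modulation frequency, while the carrier information lives entirely in `phase`. A magnitude-only modulation spectrum keeps the first and drops the second, which is the loss that the complex transform is meant to avoid.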
