Channel compensation of modulation spectral features

We propose a new channel compensation method for modulation spectral features, subband normalization, and compare it with a more traditional method, cepstral mean subtraction (CMS). Experimental results show that subband-normalized modulation scale features offer advantages over CMS features: the proposed method is robust not only to slowly varying convolutional noise but also to time-scale modification and time misalignment, whereas CMS is not robust to these time distortions. We discuss the theory of estimating a modulation scale representation and of its channel compensation. Audio identification is used for experimental verification. Simulation results on a large database show that the proposed method achieves high accuracy despite convolutional noise and time distortions.
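The contrast drawn in the abstract can be illustrated with a small numerical sketch. It is not the authors' implementation and rests on several assumptions: the channel is modeled as one constant gain per subband, a plain Fourier modulation spectrum stands in for the modulation scale representation, a circular shift stands in for time misalignment, and every function name below is hypothetical. Under those assumptions, dividing a subband's modulation magnitudes by that subband's DC term cancels the gain, as does subtracting the time average of the log envelope (the step that CMS applies in the cepstral domain); only the former is also unchanged by the shift.

```python
import cmath
import math
import random


def modulation_magnitudes(envelope):
    """Magnitude of the DFT of one subband envelope (naive O(T^2) DFT)."""
    n = len(envelope)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * m * t / n)
                for t, x in enumerate(envelope)))
        for m in range(n // 2 + 1)
    ]


def subband_normalize(envelope):
    """Assumed normalization: divide by the DC modulation term of the subband."""
    mags = modulation_magnitudes(envelope)
    return [m / mags[0] for m in mags]


def log_mean_subtract(envelope):
    """CMS-style step: remove the time average of the log envelope."""
    logs = [math.log(x) for x in envelope]
    mean = sum(logs) / len(logs)
    return [v - mean for v in logs]


def max_diff(a, b):
    """Largest absolute element-wise difference between two feature lists."""
    return max(abs(x - y) for x, y in zip(a, b))


random.seed(0)
# Synthetic, strictly positive envelope of one subband (illustrative data only).
clean = [1.0 + 0.5 * math.sin(2 * math.pi * 3 * t / 64) + 0.1 * random.random()
         for t in range(64)]
gain = 0.3                               # hypothetical channel gain in this subband
filtered = [gain * x for x in clean]     # channel applied
shifted = filtered[5:] + filtered[:5]    # channel plus a 5-frame circular shift

gain_error_subband = max_diff(subband_normalize(clean), subband_normalize(filtered))
gain_error_cms = max_diff(log_mean_subtract(clean), log_mean_subtract(filtered))
shift_error_subband = max_diff(subband_normalize(clean), subband_normalize(shifted))
shift_error_cms = max_diff(log_mean_subtract(clean), log_mean_subtract(shifted))

print(gain_error_subband, gain_error_cms)    # both near zero
print(shift_error_subband, shift_error_cms)  # near zero vs. clearly nonzero
```

Both features remove the fixed gain, but the mean-subtracted log envelope is still indexed by frame, so a frame-by-frame comparison breaks once the signals are misaligned. The normalized modulation magnitudes discard phase and are therefore unaffected by the circular shift. Time-scale modification is not modeled here.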
