Improved modulation spectrum through multi-scale modulation frequency decomposition

The modulation spectrum is a promising way to incorporate dynamic information into pattern classification. It carries important cues about the nonstationary content of a signal and yields complementary improvements when combined with conventional features derived from short-term analysis. Many prior modulation spectrum approaches are based on uniform modulation frequency decomposition. Their drawbacks are high dimensionality and the lack of a connection to human perception of modulation. This paper presents a multi-scale modulation frequency decomposition and shows an improvement over the standard modulation spectrum in a digital communication signal classification task. Features derived from this representation yield lower classification error rates than those from a constant-bandwidth modulation spectrum, whether used alone or in combination with short-term features.
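The contrast between uniform and multi-scale decomposition can be illustrated with a minimal sketch. The abstract does not specify the paper's actual filterbank, so everything here is an assumption made for illustration: the modulation spectrum is taken as the Fourier transform over time of each short-term spectral magnitude envelope, and the multi-scale variant is approximated by pooling modulation power into octave-wide bands. The function names, frame length, hop size, and band edges are all hypothetical.

```python
import numpy as np

def modulation_spectrum(x, fs, frame_len=256, hop=64):
    """Return (mod_freqs, P); P[k, m] is the modulation power of
    acoustic bin k at modulation frequency mod_freqs[m]."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Short-term magnitude spectra, transposed to (acoustic bin, frame).
    envelopes = np.abs(np.fft.rfft(frames, axis=1)).T
    # Remove each envelope's mean so the DC term does not dominate.
    envelopes = envelopes - envelopes.mean(axis=1, keepdims=True)
    # Second transform, along time: the modulation frequency axis.
    P = np.abs(np.fft.rfft(envelopes, axis=1)) ** 2
    mod_freqs = np.fft.rfftfreq(n_frames, d=hop / fs)
    return mod_freqs, P

def pool_bands(P, mod_freqs, edges):
    """Sum modulation power over the half-open bands [edges[i], edges[i+1])."""
    return np.stack(
        [P[:, (mod_freqs >= lo) & (mod_freqs < hi)].sum(axis=1)
         for lo, hi in zip(edges[:-1], edges[1:])],
        axis=1)

def uniform_edges(f_max, n_bands):
    """Constant-bandwidth band edges from 0 to f_max."""
    return np.linspace(0.0, f_max, n_bands + 1)

def octave_edges(f_min, f_max):
    """Assumed multi-scale layout: one band below f_min, octave-wide
    bands above it, and a final band reaching up to f_max."""
    n = int(np.floor(np.log2(f_max / f_min)))
    edges = f_min * 2.0 ** np.arange(n + 1)
    return np.concatenate(([0.0], edges, [f_max]))

# Toy signal: 1 kHz carrier, amplitude-modulated at 10 Hz.
fs = 8000
t = np.arange(2 * fs) / fs
x = (1 + 0.8 * np.cos(2 * np.pi * 10 * t)) * np.sin(2 * np.pi * 1000 * t)

mod_freqs, P = modulation_spectrum(x, fs)
uniform = pool_bands(P, mod_freqs, uniform_edges(mod_freqs[-1], 32))
multi = pool_bands(P, mod_freqs, octave_edges(1.0, mod_freqs[-1]))
print(uniform.shape, multi.shape)
```

In this toy setup the octave-band pooling describes each acoustic bin with far fewer modulation features than the 32 uniform bands, which mirrors the dimensionality argument made above. It says nothing about classification accuracy, which depends on the task and the classifier.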
