Classification of Stop Consonants using Modulation Spectrogram-Based Features

In this paper, we propose the use of modulation spectrogram-based features for classifying stop consonants by their place of articulation. Stop sounds are classified as bilabial, alveolar, or velar according to their place of articulation. The modulation spectrogram is a two-dimensional (2-D) feature that represents the low-frequency modulations of a signal as a function of acoustic frequency. In this work, the modulation spectrogram is computed for all stop consonants in the TIMIT database, and a dimension reduction algorithm, namely higher-order singular value decomposition (HOSVD), is then applied to the feature vectors. The reduced-dimension feature set is fed to a support vector machine (SVM) classifier, which gives an overall accuracy of 94.25% for stop classification and 95.29% for place-of-articulation classification.
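
The feature pipeline described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The frame length, hop size, number of modulation bins, retained ranks, and all function names are assumptions chosen for illustration, and random noise stands in for TIMIT stop segments. The modulation spectrogram is approximated here by a second Fourier transform taken along time over each acoustic-frequency band of a magnitude spectrogram. The HOSVD step keeps the leading singular vectors of the acoustic-frequency and modulation-frequency unfoldings of the stacked training tensor.

```python
import numpy as np

def modulation_spectrogram(x, frame_len=64, hop=16, n_mod=32):
    """Return an (acoustic frequency x modulation frequency) magnitude map.

    All parameter values are illustrative assumptions, not the paper's settings.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # First transform: short-time magnitude spectrum, shape (n_frames, n_acoustic).
    spec = np.abs(np.fft.rfft(frames, axis=1))
    # Second transform along time: modulation content of each acoustic band.
    mod = np.abs(np.fft.rfft(spec, n=2 * (n_mod - 1), axis=0))
    return mod.T  # shape (n_acoustic, n_mod)

def unfold(tensor, mode):
    """Mode-n unfolding: move the chosen axis first and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def hosvd_factors(tensor, ranks):
    """Leading left singular vectors of the first len(ranks) unfoldings."""
    factors = []
    for mode, rank in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    return factors

def project(sample, u_acoustic, u_mod):
    """Reduce one modulation spectrogram to a short feature vector."""
    return (u_acoustic.T @ sample @ u_mod).ravel()

# Synthetic stand-in data: 20 noise segments instead of real stop consonants.
rng = np.random.default_rng(0)
signals = rng.standard_normal((20, 1024))
tensor = np.stack([modulation_spectrogram(s) for s in signals], axis=-1)

u_acoustic, u_mod = hosvd_factors(tensor, ranks=(5, 4))
features = np.stack(
    [project(tensor[:, :, k], u_acoustic, u_mod) for k in range(tensor.shape[2])]
)
print(tensor.shape, features.shape)
```

Each row of `features` is a reduced-dimension vector that could then be passed to an SVM classifier; that stage is omitted so the sketch depends only on NumPy.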
