Paper information - Speech recognition of specific two-word Chinese vocabulary by applying Fourier transform twice to the broad-band spectrogram

This paper presents a method for recognizing specific two-word Chinese vocabulary items by analyzing speech signals through a broad-band spectrogram to which the Fourier transform has been applied twice. First, we analyze in detail the frequency-domain representation of the broad-band spectrogram obtained after applying the Fourier transform twice, together with the voice characteristics it reflects. Then, binary width zoning column projection is carried out in the frequency domain of the broad-band spectrogram. The projection values serve as the speech recognition features, and a support vector machine (SVM) is used as the classifier for the specific two-word Chinese vocabulary. A total of 1000 voice samples were used in the simulation, and the method achieved a recognition rate of 93.4%. The proposed method offers a new approach to vocabulary recognition.
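The feature-extraction steps named in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not define its terms precisely, so the sketch assumes that "applying Fourier transform twice" can be read as a 2-D FFT of the spectrogram image, and that "binary width zoning column projection" can be approximated by summing each column and averaging the sums over a power-of-two number of equal-width zones. All function names, the window length, the hop size, and the zone count are hypothetical.

```python
import numpy as np


def broadband_spectrogram(signal, win_len=64, hop=32):
    """Magnitude STFT; a short analysis window gives a broad-band spectrogram.

    win_len and hop are illustrative values, not taken from the paper.
    """
    window = np.hamming(win_len)
    n_frames = 1 + (len(signal) - win_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + win_len] * window for i in range(n_frames)]
    )
    # Transpose so that rows are frequency bins and columns are time frames.
    return np.abs(np.fft.rfft(frames, axis=1)).T


def spectrogram_frequency_domain(spec):
    """Assumed reading of the second step: centered log-magnitude 2-D FFT."""
    return np.log1p(np.abs(np.fft.fftshift(np.fft.fft2(spec))))


def zoned_column_projection(image, n_zones=16):
    """Assumed reading of the projection: column sums averaged per zone."""
    col_sums = image.sum(axis=0)
    zones = np.array_split(col_sums, n_zones)
    return np.array([zone.mean() for zone in zones])


def extract_features(signal, n_zones=16):
    """Chain the three steps into one fixed-length feature vector."""
    spec = broadband_spectrogram(signal)
    return zoned_column_projection(spectrogram_frequency_domain(spec), n_zones)
```

A feature vector produced this way has a fixed length regardless of utterance duration, so it could be passed to an SVM classifier such as scikit-learn's `SVC`. The kernel and training setup used in the paper are not stated in the abstract.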

Ying Wei | Tingfa Xu | Di Pan | Shuangwei Wang | Shili Liang
