A comparison of DFT, PLP and cochleagram for alphabet recognition

The authors evaluate the relative performance of the discrete Fourier transform (DFT), perceptual linear predictive (PLP) analysis, and the cochleagram auditory model on speaker-independent classification of isolated English letters. Feedforward neural network classifiers were trained on each of the three representations using 60 speakers and tested on 60 new speakers. The training and testing data were independently corrupted by adding two levels of Gaussian noise and of babble (a mix of 20 random letter utterances, attenuated and given random offsets). PLP gave the best results, especially when the classifiers were trained or tested on Gaussian noise.
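
The two corruption conditions can be illustrated with a short sketch. This is a minimal illustration only, assuming mono waveforms stored as NumPy arrays; the function names, SNR values, and attenuation factor are hypothetical and are not the paper's actual parameters.

```python
import numpy as np

def add_gaussian_noise(signal, snr_db):
    """Corrupt a waveform with white Gaussian noise at a given SNR (dB)."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = np.random.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

def make_babble(letter_utterances, length, attenuation=0.1, rng=None):
    """Build babble by summing attenuated utterances at random offsets."""
    rng = rng or np.random.default_rng()
    babble = np.zeros(length)
    for utt in letter_utterances:  # e.g. 20 randomly chosen letter utterances
        offset = rng.integers(0, max(1, length - len(utt)))
        end = min(length, offset + len(utt))
        babble[offset:end] += attenuation * utt[: end - offset]
    return babble

def add_babble(signal, letter_utterances, attenuation=0.1, rng=None):
    """Corrupt a waveform with babble built from other letter utterances."""
    return signal + make_babble(letter_utterances, len(signal), attenuation, rng)

# Example: corrupt one clean utterance both ways before feature extraction.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.standard_normal(16000)                 # stand-in for a letter utterance
    others = [rng.standard_normal(8000) for _ in range(20)]
    noisy = add_gaussian_noise(clean, snr_db=10)       # one of the "two levels" (hypothetical SNR)
    babbled = add_babble(clean, others, attenuation=0.1, rng=rng)
```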
