Auditory spectrum based features (ASBF) for robust speech recognition
Mel-frequency cepstral coefficients (MFCC) are features commonly used in speech recognition systems today. Systems using MFCC are known to achieve high recognition accuracy on clean speech, but their accuracy drops sharply in noisy environments. In this paper, we propose new features, called auditory spectrum based features (ASBF), that are based on the cochlear model of the human auditory system. These features can track the formants, and the scheme for selecting them is based on the second-order difference cochlear model and the primary auditory nerve processing model. In our experiments, we compare the performance of MFCC and ASBF in clean and noisy environments. The results suggest that ASBF are much more robust to noise than MFCC.
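A clean-versus-noisy comparison of this kind needs test utterances corrupted at a controlled signal-to-noise ratio (SNR). The abstract does not state the noise type or SNR levels used, so the following is only a minimal sketch under assumed conditions: additive white Gaussian noise, a 10 dB target SNR, and a synthetic sine wave standing in for speech. The function names `add_noise_at_snr` and `measured_snr_db` are hypothetical and not taken from the paper.

```python
import math
import random


def add_noise_at_snr(clean, snr_db, seed=0):
    """Return `clean` plus white Gaussian noise scaled to the requested SNR (dB).

    Assumption: white Gaussian noise; the paper's actual noise source is not
    given in the abstract.
    """
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in clean]
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum(n * n for n in noise) / len(noise)
    # Choose gain so that signal_power / (gain**2 * noise_power) = 10**(snr_db / 10).
    gain = math.sqrt(signal_power / (noise_power * 10 ** (snr_db / 10)))
    return [x + gain * n for x, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """Compute the SNR in dB, treating (noisy - clean) as the noise component."""
    signal_power = sum(x * x for x in clean)
    noise_power = sum((y - x) ** 2 for x, y in zip(clean, noisy))
    return 10 * math.log10(signal_power / noise_power)


# Toy stand-in for a speech signal: one second of a 440 Hz sine at 8 kHz.
clean = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
noisy = add_noise_at_snr(clean, snr_db=10)
```

Feature extractors under comparison would then be run on both `clean` and `noisy`, with recognition accuracy reported per SNR level.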