An Improved Method for Speech Enhancement Based on Human Auditory Masking Properties

This paper proposes an improved speech enhancement method based on human auditory masking properties. The method first applies an improved noise estimation algorithm, then uses the estimated noise to compute the masking threshold of the speech, and finally adjusts the time and frequency coefficients according to perceptual criteria. The results show that the improved method yields a better signal-to-noise ratio and significantly reduces both the background noise and the unnatural structure of the residual noise.
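
The three stages named in the abstract (noise estimation, masking threshold, perceptual adjustment of the coefficients) can be illustrated with a minimal sketch. It is not the paper's algorithm: the running-minimum noise tracker, the three-bin spreading with a fixed 10 dB offset, and the gain floor are all simplifying assumptions, and every function and parameter name below is hypothetical. Input frames are assumed to be per-bin power spectra stored as plain lists.

```python
def estimate_noise(frames, alpha=0.9):
    """Assumed stand-in noise tracker: recursive smoothing plus a running minimum per bin."""
    smoothed = list(frames[0])
    noise = list(frames[0])
    for frame in frames[1:]:
        # First-order recursive smoothing of the noisy power spectrum.
        smoothed = [alpha * s + (1 - alpha) * p for s, p in zip(smoothed, frame)]
        # The minimum of the smoothed power is taken as the noise level.
        noise = [min(n, s) for n, s in zip(noise, smoothed)]
    return noise


def masking_threshold(clean_power, offset_db=10.0):
    """Crude threshold (assumption): average over neighbouring bins, lowered by offset_db."""
    n_bins = len(clean_power)
    scale = 10 ** (-offset_db / 10)
    threshold = []
    for k in range(n_bins):
        lo, hi = max(0, k - 1), min(n_bins, k + 2)
        spread = sum(clean_power[lo:hi]) / (hi - lo)
        threshold.append(scale * spread)
    return threshold


def perceptual_gain(frame, noise, floor=0.1):
    """Per-bin power gain: keep bins whose noise is masked, attenuate the rest."""
    # Rough clean-speech estimate by power spectral subtraction.
    clean = [max(p - n, 0.0) for p, n in zip(frame, noise)]
    threshold = masking_threshold(clean)
    gains = []
    for p, n, c, t in zip(frame, noise, clean, threshold):
        if n <= t:
            # Noise lies below the masking threshold, so it is treated as inaudible.
            gains.append(1.0)
        else:
            # Audible noise: subtraction-style gain, limited by a floor.
            ratio = c / p if p > 0 else 0.0
            gains.append(max(ratio, floor))
    return gains


if __name__ == "__main__":
    noise_only = [[1.0, 1.0, 1.0, 1.0]] * 5
    noise = estimate_noise(noise_only)
    speech_frame = [1.0, 101.0, 1.0, 1.0]
    print(perceptual_gain(speech_frame, noise))
```

In this sketch, bins adjacent to a strong speech component are left untouched because the noise there is assumed to be masked, while bins far from any speech energy are attenuated down to the floor. Leaving masked bins unmodified is one common way to limit the artificial structure of the residual noise.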
