Automated Speech Discrimination using Frequency Derivative Threshold Detection

Traditionally, speech has been discriminated from non-speech signals in audio by applying thresholds to the amplitude or energy of the signal. While these methods, along with newer ones, have been shown to discriminate speech effectively, this paper takes a different perspective. In general, the frequency of a person's voice varies only within a limited range over short time periods. In contrast, many undesirable background noises have highly varying frequency content. Analysing the change in frequency over time therefore offers an effective means of distinguishing human speech from noise. This paper uses this form of speech discrimination to develop an efficient method for isolating speech from non-speech signals.
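The idea of thresholding the change in frequency over time can be illustrated with a minimal sketch. The abstract does not specify how frequency is estimated or what threshold is used, so everything below is an assumption made for illustration: the zero-crossing frequency estimate, the helper names (`tone`, `frame_frequency`, `classify_frames`), the 50 ms frame length, and the 200 Hz threshold are all hypothetical and are not taken from the paper.

```python
import math


def tone(freq_hz, num_samples, sample_rate, phase=0.3):
    """Generate a sine tone (synthetic test input, not real speech)."""
    step = 2.0 * math.pi * freq_hz / sample_rate
    return [math.sin(step * n + phase) for n in range(num_samples)]


def frame_frequency(frame, sample_rate):
    """Roughly estimate a frame's dominant frequency from its zero crossings.

    Assumption: a sinusoid crosses zero twice per cycle, so
    frequency ~= crossings / (2 * frame duration).
    """
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0)
    )
    duration = len(frame) / sample_rate
    return crossings / (2.0 * duration)


def classify_frames(signal, sample_rate, frame_len=400, max_delta_hz=200.0):
    """Label each frame transition as speech-like (True) or noise-like (False).

    A transition is speech-like when the estimated frequency changes by at
    most max_delta_hz between consecutive frames. Both frame_len and
    max_delta_hz are illustrative values, not values from the paper.
    """
    freqs = [
        frame_frequency(signal[i:i + frame_len], sample_rate)
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]
    return [abs(cur - prev) <= max_delta_hz for prev, cur in zip(freqs, freqs[1:])]


SAMPLE_RATE = 8000

# A steady 200 Hz tone stands in for a slowly varying voiced sound.
steady = tone(200.0, SAMPLE_RATE, SAMPLE_RATE)

# A signal that hops between 200 Hz and 2000 Hz every frame stands in
# for noise with highly varying frequency content.
hopping = []
for k in range(20):
    hopping.extend(tone(200.0 if k % 2 == 0 else 2000.0, 400, SAMPLE_RATE))

steady_labels = classify_frames(steady, SAMPLE_RATE)
hopping_labels = classify_frames(hopping, SAMPLE_RATE)
```

In this sketch the steady tone yields only speech-like transitions, while the frequency-hopping signal yields only noise-like ones. Real speech and real noise are far less clean, so a practical detector would likely need a more robust frequency estimate (for example, a spectral peak) and a threshold tuned on recorded data.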
