This work presents experimental results on speaker recognition (SR) with voice activity detection (VAD). A VAD algorithm based on a finite state machine is introduced first and then incorporated into two SR systems. Both systems use Mel-frequency cepstral coefficients (MFCCs) as the speech features; one uses vector quantization (VQ) as its classifier and the other a Gaussian mixture model (GMM). The experimental results show that VAD improves the performance of both SR systems on a small speech database. However, as the speech database grows, the performance of both systems with VAD becomes progressively worse than that of the corresponding systems without VAD. The reason for this phenomenon is analyzed in detail.
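To make the idea of a finite-state-machine VAD concrete, the following is a minimal sketch and not the algorithm evaluated in this work. It assumes a fixed short-time energy threshold and a four-state machine with hangover counters; the state names, the functions `frame_energies` and `fsm_vad`, and the parameters `threshold`, `min_speech`, and `min_silence` are all hypothetical choices made for illustration.

```python
from enum import Enum


class State(Enum):
    # Hypothetical state set; the source does not specify its states.
    SILENCE = 0
    MAYBE_SPEECH = 1
    SPEECH = 2
    MAYBE_SILENCE = 3


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def fsm_vad(energies, threshold, min_speech=3, min_silence=5):
    """Return speech segments as half-open (start_frame, end_frame) pairs.

    A segment is opened only after `min_speech` consecutive frames exceed
    `threshold`, and closed only after `min_silence` consecutive frames
    fall at or below it (assumed hangover scheme).
    """
    state = State.SILENCE
    segments = []
    start = 0   # first frame of the candidate segment
    count = 0   # consecutive frames seen in a tentative state
    for i, energy in enumerate(energies):
        loud = energy > threshold
        if state is State.SILENCE:
            if loud:
                state, start, count = State.MAYBE_SPEECH, i, 1
        elif state is State.MAYBE_SPEECH:
            if loud:
                count += 1
                if count >= min_speech:
                    state = State.SPEECH
            else:
                state = State.SILENCE  # burst too short, discard it
        elif state is State.SPEECH:
            if not loud:
                state, count = State.MAYBE_SILENCE, 1
        else:  # State.MAYBE_SILENCE
            if loud:
                state = State.SPEECH  # short pause, stay in the segment
            else:
                count += 1
                if count >= min_silence:
                    # Segment ends where the quiet run began.
                    segments.append((start, i - count + 1))
                    state = State.SILENCE
    # Flush a segment that is still open when the signal ends.
    if state is State.SPEECH:
        segments.append((start, len(energies)))
    elif state is State.MAYBE_SILENCE:
        segments.append((start, len(energies) - count))
    return segments
```

In an SR front end of the kind described, only the frames inside the returned segments would be passed on to MFCC extraction; how the threshold is chosen (fixed, or adapted to the noise floor) is left open here.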