Research on Speaker Recognition Based on Multifractal Spectrum Feature

This paper proposes a new nonlinear feature extraction method based on the wavelet transform modulus maxima (WTMM) method, which simplifies extraction of the multifractal spectrum feature (MSF) from speech signals. Combining the MSF with traditional linear features noticeably improves the performance of a speaker recognition system. Experimental results show that combining a 6-dimensional MSF with LPC raises recognition accuracy by 6.4 percentage points, and that combining a 6-dimensional MSF with MFCC and LPC raises recognition accuracy by 1.6 percentage points, reaching 98.8% in short-speech (2-second) speaker recognition.
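
As an illustration of the general WTMM idea only, and not the authors' implementation, the following minimal sketch estimates a multifractal spectrum from a one-dimensional signal. It rests on several assumptions not stated in the abstract: a Mexican-hat analysing wavelet, a simplified partition function summed over the local maxima at each scale (without chaining maxima into lines across scales), a small relative threshold to keep negative moments stable, and the hypothetical function names `cwt`, `modulus_maxima`, and `multifractal_spectrum`.

```python
import numpy as np


def mexican_hat(n_points, scale):
    """Mexican-hat wavelet sampled on n_points, dilated by `scale`.

    Divided by `scale` (L1 normalisation) so that |W| scales roughly as a**h.
    """
    t = np.arange(n_points) - (n_points - 1) / 2.0
    x = t / scale
    return (1.0 - x**2) * np.exp(-0.5 * x**2) / scale


def cwt(signal, scales):
    """Continuous wavelet transform by direct convolution, one row per scale."""
    rows = []
    for a in scales:
        width = int(min(10 * a, len(signal)))  # kernel support, an assumption
        rows.append(np.convolve(signal, mexican_hat(width, a), mode="same"))
    return np.array(rows)


def modulus_maxima(row):
    """Values of |W| at its local maxima along the position axis."""
    m = np.abs(row)
    interior = m[1:-1]
    is_peak = (interior > m[:-2]) & (interior >= m[2:])
    return interior[is_peak]


def multifractal_spectrum(signal, scales, qs):
    """Return tau(q), h(q) and D(h) from a simplified WTMM partition function."""
    coeffs = cwt(signal, scales)
    log_a = np.log(scales)
    tau = np.empty(len(qs))
    for i, q in enumerate(qs):
        log_z = []
        for row in coeffs:
            maxima = modulus_maxima(row)
            # Assumption: drop near-zero maxima so negative q does not blow up.
            maxima = maxima[maxima > 1e-6 * maxima.max()]
            log_z.append(np.log(np.sum(maxima**q)))
        # Z(q, a) ~ a**tau(q): tau is the slope in log-log coordinates.
        tau[i] = np.polyfit(log_a, log_z, 1)[0]
    h = np.gradient(tau, qs)   # h(q) = d tau / d q
    d = qs * h - tau           # Legendre transform: D(h) = q*h - tau(q)
    return tau, h, d
```

A fixed-length feature vector could then be built, for example, by sampling the resulting D(h) curve at a few values of q. How the paper's 6-dimensional MSF is actually constructed is not specified in the abstract, so that step is left open here.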
