On the Performance of Hurst-Vectors for Speaker Identification Systems

This paper presents and discusses the performance of Hurst-Vectors (the pH feature) in speaker identification systems. The pH feature is a vector of Hurst (H) parameters obtained by applying a wavelet-based multi-dimensional estimator (M_dim_wavelets) to windowed short-time segments of speech. Two classification systems are considered in the performance analysis: GMM (Gaussian Mixture Models) and M_dim_fBm (multi-dimensional fractional Brownian motion). The speech database, recorded over fixed and cellular phone channels, contains utterances from 75 different speakers. The results show the superior performance of the M_dim_fBm classifier and indicate that the pH feature contributes new information about speaker identity.
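As a rough illustration of the idea behind a wavelet-based Hurst estimate on short-time segments, the sketch below fits a line to the log2 energy of Haar detail coefficients across scales and maps the slope to H. It is not the M_dim_wavelets estimator itself: the Haar wavelet, the unweighted least-squares fit, the rectangular framing, the long-range-dependence convention H = (slope + 1) / 2, and all function and parameter names (`haar_details`, `hurst_wavelet`, `frame_hurst`, `frame_len`, `hop`, `n_scales`) are assumptions made for the example. It also yields a single H value per frame rather than a vector.

```python
import numpy as np


def haar_details(x, n_scales):
    """Orthonormal Haar detail coefficients for scales 1..n_scales (assumed wavelet)."""
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(n_scales):
        if approx.size < 2:
            break
        approx = approx[: approx.size // 2 * 2]  # drop a trailing odd sample
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))
        approx = (even + odd) / np.sqrt(2.0)
    return details


def hurst_wavelet(x, n_scales=6):
    """Estimate H from the slope of log2 detail energy versus scale.

    Assumes the long-range-dependence convention H = (slope + 1) / 2,
    under which white noise gives a slope near 0 and H near 0.5.
    """
    details = haar_details(x, n_scales)
    if len(details) < 2:
        raise ValueError("need at least two scales to fit a slope")
    scales = np.arange(1, len(details) + 1)
    log_energy = np.array([np.log2(np.mean(d ** 2)) for d in details])
    slope = np.polyfit(scales, log_energy, 1)[0]  # unweighted fit (simplification)
    return 0.5 * (slope + 1.0)


def frame_hurst(signal, frame_len, hop, n_scales=4):
    """One H estimate per rectangular frame of the signal (hypothetical framing)."""
    signal = np.asarray(signal, dtype=float)
    starts = range(0, signal.size - frame_len + 1, hop)
    return np.array(
        [hurst_wavelet(signal[s : s + frame_len], n_scales) for s in starts]
    )
```

For example, `frame_hurst(speech, frame_len=1024, hop=512)` would return one estimate per frame for a hypothetical sample array `speech`. Short frames leave few coefficients at the coarser scales, so per-frame estimates are noisy and `n_scales` should be kept small.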
