Paper information - A new optimum feature extraction and classification method for speaker recognition: GWPNN

Abstract: Speech and speaker recognition are important tasks for computer systems. In this paper, an expert speaker recognition system based on optimum wavelet packet entropy is proposed and applied to real speech/voice signals. The study combines a new feature extraction approach with a classification approach, both using optimum wavelet packet entropy parameter values. These values are obtained from real English-language speech/voice signal waveforms measured with a speech experimental set. A genetic-wavelet packet-neural network (GWPNN) model is developed. GWPNN has three layers: a genetic algorithm, a wavelet packet layer, and a multi-layer perceptron. The genetic algorithm layer selects the feature extraction method and finds the optimum wavelet entropy parameter values. It chooses one of four alternative feature extraction methods: (1) wavelet packet decomposition, (2) wavelet packet decomposition with short-time Fourier transform, (3) wavelet packet decomposition with Born–Jordan time–frequency representation, and (4) wavelet packet decomposition with Choi–Williams time–frequency representation. The wavelet packet layer performs optimum feature extraction in the time–frequency domain and is composed of wavelet packet decomposition and wavelet packet entropies. The multi-layer perceptron of GWPNN, a feed-forward neural network, is used both to evaluate the fitness function of the genetic algorithm and to classify speakers. The performance of the developed system was evaluated using noisy English speech/voice signals. The test results showed that the system was effective in detecting real speech signals. The correct classification rate for speaker classification was about 85%.
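To make the wavelet packet layer concrete, the following is a minimal sketch of wavelet packet entropy features, not the paper's implementation. The abstract names neither the wavelet filter nor the entropy formula, so the sketch assumes a Haar filter and a norm-type entropy, the sum of |c|^p over each node's coefficients, with p standing in for the entropy parameter that the genetic algorithm would tune. All function names are hypothetical.

```python
import math


def haar_step(x):
    """Split an even-length sequence into Haar approximation and detail halves."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def wavelet_packet_leaves(signal, levels):
    """Full wavelet packet tree: every node is split at every level.

    Returns the 2**levels leaf nodes in natural (not frequency) order.
    """
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


def norm_entropy(coeffs, p):
    """Assumed entropy measure: sum of |c|**p over the node's coefficients."""
    return sum(abs(c) ** p for c in coeffs)


def wp_entropy_features(signal, levels, p):
    """One entropy value per leaf node; this vector would feed the classifier."""
    return [norm_entropy(leaf, p) for leaf in wavelet_packet_leaves(signal, levels)]


if __name__ == "__main__":
    # Toy signal only; real input would be a sampled speech frame.
    frame = [math.sin(0.3 * n) + 0.5 * math.sin(1.7 * n) for n in range(64)]
    print(wp_entropy_features(frame, levels=3, p=1.5))
```

In a GWPNN-style loop, a genetic algorithm would propose candidate values of p (and a choice among the four feature extraction methods), and the classifier's accuracy on the resulting feature vectors would serve as the fitness value.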

Engin Avci | E. Avci
