A new optimum feature extraction and classification method for speaker recognition: GWPNN

Abstract: Speech and speaker recognition is an important task for computer systems. This paper proposes an expert speaker recognition system based on optimum wavelet packet entropy, using real speech/voice signals. The study combines a new feature extraction approach with a classification approach, both driven by optimum wavelet packet entropy parameter values. These values are obtained from measured real English-language speech/voice waveforms recorded with a speech experimental set. A genetic-wavelet packet-neural network (GWPNN) model is developed. GWPNN has three layers: a genetic algorithm, a wavelet packet layer, and a multi-layer perceptron. The genetic algorithm layer selects the feature extraction method and finds the optimum wavelet entropy parameter values. It selects one of four candidate feature extraction methods: wavelet packet decomposition alone, or wavelet packet decomposition combined with the short-time Fourier transform, the Born–Jordan time–frequency representation, or the Choi–Williams time–frequency representation. The wavelet packet layer performs optimum feature extraction in the time–frequency domain and consists of wavelet packet decomposition and wavelet packet entropies. The multi-layer perceptron, a feed-forward neural network, evaluates the fitness function of the genetic algorithm and classifies speakers. The performance of the system was evaluated on noisy English speech/voice signals. The test results showed that the system was effective in detecting real speech signals, with a correct classification rate of about 85% for speaker classification.
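To make the wavelet packet layer concrete, the following is a minimal sketch of wavelet packet entropy feature extraction. It is not the authors' implementation: the Haar filter, the Shannon energy entropy, the decomposition level, and all function names are assumptions chosen for illustration, since the abstract does not specify the wavelet or the entropy definition. In GWPNN the entropy parameter is tuned by the genetic algorithm, which this sketch leaves out.

```python
import numpy as np

def haar_split(x):
    """One Haar analysis step: return (approximation, detail) half-length bands."""
    x = np.asarray(x, dtype=float)
    if len(x) % 2:  # pad odd-length input by repeating the last sample
        x = np.append(x, x[-1])
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def wavelet_packet_nodes(x, level):
    """Full wavelet packet tree: every node is split at each level,
    giving 2**level terminal nodes (in natural, not frequency, order)."""
    nodes = [np.asarray(x, dtype=float)]
    for _ in range(level):
        nodes = [half for node in nodes for half in haar_split(node)]
    return nodes

def shannon_entropy(coeffs, eps=1e-12):
    """Shannon entropy of the normalized coefficient energies of one node.
    Assumed entropy definition; the paper's tuned entropy may differ."""
    energy = np.asarray(coeffs, dtype=float) ** 2
    total = energy.sum()
    if total <= eps:  # an empty or silent band carries no information
        return 0.0
    p = energy / total
    p = p[p > 0]      # skip zero terms so log is defined
    return float(-(p * np.log(p)).sum())

def wp_entropy_features(x, level=3):
    """Feature vector: one entropy value per terminal wavelet packet node."""
    return np.array([shannon_entropy(n) for n in wavelet_packet_nodes(x, level)])
```

A feature vector of this kind (eight values at level 3) would be the input to the multi-layer perceptron. In the described system, the classifier's accuracy then serves as the fitness value the genetic algorithm uses to compare candidate feature extraction methods and entropy parameters.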
