Speaker identification using discrete wavelet packet transform technique with irregular decomposition

This paper presents a study of speaker identification for security systems based on the energy of speaker utterances. The proposed system consisted of three stages: signal pre-processing, feature extraction using the wavelet packet transform (WPT), and speaker identification using an artificial neural network. In the pre-processing stage, the amplitudes of utterances of the same sentence were normalized to prevent estimation errors caused by changes in the speakers' volume. In the feature extraction stage, three conventional methods were considered in the experiments and compared with the irregular decomposition method used in the proposed system. To verify the identification performance of the proposed system, a general regression neural network (GRNN) was used in the experimental comparison. The experimental results demonstrated the effectiveness of the proposed speaker identification system relative to the discrete wavelet transform (DWT), the conventional WPT, and the WPT in Mel scale.
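The three stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: the Haar filter, the leaf paths of the irregular tree, the peak-based normalization, and the smoothing parameter `sigma` are all assumptions chosen for brevity, since the abstract does not specify them. The function names are hypothetical.

```python
import math


def normalize_amplitude(signal):
    """Scale the utterance so its peak absolute amplitude is 1 (assumed scheme)."""
    peak = max(abs(s) for s in signal)
    return [s / peak for s in signal] if peak > 0 else list(signal)


def haar_split(x):
    """One orthonormal Haar analysis step: returns (approximation, detail)."""
    r = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * r for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) * r for i in range(0, len(x) - 1, 2)]
    return approx, detail


def subband(signal, path):
    """Follow a path such as "aad" ('a' = approximation, 'd' = detail)."""
    node = list(signal)
    for step in path:
        approx, detail = haar_split(node)
        node = approx if step == "a" else detail
    return node


def energy_features(signal, leaf_paths):
    """Energy of each leaf of an irregular (unevenly deep) packet tree."""
    x = normalize_amplitude(signal)
    return [sum(c * c for c in subband(x, p)) for p in leaf_paths]


def grnn_predict(train_x, train_y, x, sigma=0.5):
    """General regression neural network: Gaussian-weighted mean of targets."""
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(t, x)) / (2.0 * sigma ** 2))
        for t in train_x
    ]
    total = sum(weights)
    n_out = len(train_y[0])
    if total == 0.0:  # every kernel underflowed; no evidence for any class
        return [0.0] * n_out
    return [
        sum(w * y[k] for w, y in zip(weights, train_y)) / total
        for k in range(n_out)
    ]


# Hypothetical irregular tree: the low band is split twice, the high band once.
leaves = ["aa", "ad", "d"]
utterance = [1.0, 2.0, 3.0, 4.0, 4.0, 3.0, 2.0, 1.0]
features = energy_features(utterance, leaves)
```

With one-hot speaker labels as `train_y`, the index of the largest value returned by `grnn_predict` can be read as the identified speaker. Because the features are computed after normalization, scaling the whole utterance by a constant leaves them unchanged, which is the purpose of the pre-processing step.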
