A speech signal based gender identification system using four classifiers

This paper presents a study of four different classifiers in the task of automatic speech-based gender identification. Gender identification has potential applications in automatic speech and speaker recognition systems and in content-based multimedia indexing. The classifiers used in this work are Gaussian mixture models (GMM), multilayer perceptrons (MLP), vector quantization (VQ), and learning vector quantization (LVQ), all operating on mel-frequency cepstral coefficients (MFCC). On the IViE corpus, our best system attains 96.4% identification accuracy using only 1 s of speech per speaker.
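As an illustration of how one of the four classifier families can be applied to this task, the following is a minimal sketch of VQ-based gender identification. It is not the paper's implementation. It assumes one codebook per gender trained with Lloyd's algorithm (k-means), and it assigns a test segment to the gender whose codebook gives the lowest average quantization distortion. The feature vectors are synthetic Gaussian stand-ins for MFCC frames; the dimension (12), codebook size (8), class means, and the 100-frame test segment (roughly 1 s at an assumed 10 ms frame step) are all hypothetical choices made for the example.

```python
import numpy as np


def squared_distances(frames, codebook):
    """Squared Euclidean distance from every frame to every codeword."""
    diff = frames[:, None, :] - codebook[None, :, :]
    return (diff ** 2).sum(axis=2)


def train_codebook(frames, size, iters=20, seed=0):
    """Train a (size, dim) VQ codebook with Lloyd's algorithm (k-means)."""
    rng = np.random.default_rng(seed)
    # Initialise the codewords from randomly chosen training frames.
    codebook = frames[rng.choice(len(frames), size, replace=False)].copy()
    for _ in range(iters):
        nearest = squared_distances(frames, codebook).argmin(axis=1)
        for k in range(size):
            members = frames[nearest == k]
            if len(members):  # keep the old codeword if its cell is empty
                codebook[k] = members.mean(axis=0)
    return codebook


def distortion(frames, codebook):
    """Average distance from each frame to its nearest codeword."""
    return squared_distances(frames, codebook).min(axis=1).mean()


def classify(frames, codebooks):
    """Return the label whose codebook quantizes the frames best."""
    return min(codebooks, key=lambda label: distortion(frames, codebooks[label]))


# Synthetic stand-ins for MFCC frames (assumption: NOT real speech data).
rng = np.random.default_rng(42)
dim = 12
train = {
    "female": rng.normal(loc=1.0, scale=1.0, size=(500, dim)),
    "male": rng.normal(loc=-1.0, scale=1.0, size=(500, dim)),
}
codebooks = {label: train_codebook(x, size=8) for label, x in train.items()}

# Hypothetical test segments of 100 frames each.
female_segment = rng.normal(loc=1.0, scale=1.0, size=(100, dim))
male_segment = rng.normal(loc=-1.0, scale=1.0, size=(100, dim))
predicted_female = classify(female_segment, codebooks)
predicted_male = classify(male_segment, codebooks)
print(predicted_female, predicted_male)
```

With real speech, the synthetic arrays would be replaced by MFCC frames extracted from labelled recordings. The same decision rule (pick the model that best explains the frames) carries over to a GMM by swapping average distortion for average log-likelihood.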
