Classification of male and female speech using perceptual features

Gender identification systems are gaining popularity because of their wide range of applications, from security and authentication services to content-based information retrieval and criminal investigations. Gender detection has also gained importance because recent studies have shown that gender-dependent speech recognition models perform much better than gender-independent ones. In this work, we aim to build such a system using perceptual audio features, such as pitch-based and tempo-based features and short-time energy, which are used to train classifiers to distinguish between the two gender classes. We selected this combination because previous works focused on a single approach, such as pitch or MFCC, whereas ours is perhaps one of the first to combine several such perceptual features. The system was tested on a wide range of speech files and yielded promising results.
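To make two of the named features concrete, the following is a minimal sketch of how short-time energy and a pitch estimate could be computed from a mono speech signal. It is not the authors' implementation: the function names, the frame and hop lengths, the 60 to 400 Hz pitch search range, and the use of plain autocorrelation for pitch are all assumptions chosen for illustration.

```python
import numpy as np


def short_time_energy(signal, frame_len=400, hop=200):
    """Mean squared amplitude of each frame (frame_len and hop are assumed values)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    return np.array([
        np.mean(signal[i * hop: i * hop + frame_len] ** 2)
        for i in range(n_frames)
    ])


def estimate_pitch(signal, sr, fmin=60.0, fmax=400.0):
    """Autocorrelation-based pitch estimate in Hz (search range is an assumption)."""
    x = signal - np.mean(signal)
    # Keep only non-negative lags of the autocorrelation.
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    lag_min = int(sr / fmax)
    lag_max = int(sr / fmin)
    # The strongest peak inside the allowed lag range gives the pitch period.
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return sr / lag


def feature_vector(signal, sr):
    """Stack a few summary statistics into one vector per utterance."""
    ste = short_time_energy(signal)
    return np.array([estimate_pitch(signal, sr), ste.mean(), ste.std()])
```

Vectors produced by a function like `feature_vector` could then be passed to any standard classifier. The abstract does not say which classifiers or which exact feature definitions were used, so those choices remain open here.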
