Speech Feature Extraction for Gender Recognition

Speech recognition technology can be embedded in various real-time applications to improve human-computer interaction. From robotics to health care and aerospace, and from interactive voice response systems to mobile telephony and telematics, speech recognition technology has enhanced human-machine interaction. Gender recognition is an important component of applications that embed speech recognition, as it reduces the computational complexity of further processing in these applications. This paper covers the extraction of one of the most dominant and most widely researched speech features, the Mel coefficients, together with their first- and second-order derivatives. We extracted 13 values for each of these from a dataset of 46 speech samples containing the Hindi vowels (आ, इ, ई, उ, ऊ, ऋ, ए, ऎ, ऒ, ऑ), and used stacking to train a combined model of SVM and neural network classifiers on them to determine the speaker's gender. The model achieved an accuracy of 93.48% when the first Mel coefficient was taken into consideration. The purpose of this study was to extract the correct features and to compare performance based on the first Mel coefficient.

Index Terms: Gender recognition, Hindi, mel-frequency, delta, delta-delta, neural network.
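As an illustration of how the first- and second-order derivatives (delta and delta-delta) can be derived from 13 Mel coefficients per frame, the following is a minimal sketch in Python. It assumes the common regression-based delta formula with a window of two frames on each side and repeats the edge frames as padding; the abstract does not state which formula or window the authors used, and the function names `delta` and `stack_features` are hypothetical.

```python
def delta(frames, width=2):
    """Regression-based delta over a list of per-frame coefficient vectors.

    Assumption: d_t = sum_n n * (c[t+n] - c[t-n]) / (2 * sum_n n^2),
    with n = 1..width. The window width of 2 is illustrative only.
    """
    n_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(n_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                # Frames beyond either end are replaced by the edge frame.
                ahead = frames[min(t + n, n_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_features(mfcc):
    """Concatenate static, delta, and delta-delta values for each frame."""
    d1 = delta(mfcc)   # first-order derivative
    d2 = delta(d1)     # second-order derivative
    return [c + a + b for c, a, b in zip(mfcc, d1, d2)]


# Toy input: 6 frames of 13 coefficients that rise linearly over time.
mfcc = [[float(t)] * 13 for t in range(6)]
features = stack_features(mfcc)
```

With 13 static coefficients per frame, each frame of `features` holds 39 values (13 static, 13 delta, 13 delta-delta), which is the kind of vector that could then be passed to a classifier.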
