Audio-based gender and age identification

Interaction between humans and computers is increasing rapidly, and the efficiency and comfort of these interactions depend on how much information about the user is available to the computer. Gender, age, and emotional state are among the most fundamental pieces of this information, and extracting them from audio or video data is an active research area. Several studies exist for different languages, most notably English, German, and Italian. In this study, we developed a system for the Turkish language that extracts gender and age information from speech. Our test results show that the proposed system identifies the gender and age range of the speaker with a success rate of 99%.
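The abstract does not describe the features or classifier used. As a purely illustrative sketch (not the authors' method), one simple audio cue for gender is the fundamental frequency (F0) of voiced speech: adult male voices typically fall around 85-180 Hz and adult female voices around 165-255 Hz. The example below estimates F0 of a signal frame by an unnormalized autocorrelation peak search and applies a crude pitch threshold; the function names and the 165 Hz threshold are assumptions chosen for illustration.

```python
import math

def estimate_f0(samples, sample_rate, fmin=60.0, fmax=400.0):
    """Estimate fundamental frequency (Hz) of a voiced frame by
    locating the autocorrelation peak in the plausible pitch range."""
    lag_min = int(sample_rate / fmax)
    lag_max = int(sample_rate / fmin)
    best_lag, best_corr = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        corr = sum(samples[i] * samples[i - lag]
                   for i in range(lag, len(samples)))
        if corr > best_corr:
            best_corr, best_lag = corr, lag
    return sample_rate / best_lag

def guess_gender(f0_hz, threshold_hz=165.0):
    """Crude pitch-threshold rule; threshold is an assumed value,
    not one taken from the paper."""
    return "female" if f0_hz >= threshold_hz else "male"

# Synthetic 120 Hz tone standing in for one voiced frame of a male speaker.
sr = 8000
frame = [math.sin(2 * math.pi * 120.0 * n / sr) for n in range(1024)]
f0 = estimate_f0(frame, sr)
print(f0, guess_gender(f0))
```

A real system would aggregate such frame-level features (and richer ones, e.g. MFCCs) over an utterance and feed them to a trained classifier rather than a fixed threshold.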
