Identification of age-group from children's speech by computers and humans

This paper presents results on age-group identification (Age-ID) for children’s speech, using the OGI Kids corpus and GMM-UBM, GMM-SVM and i-vector systems. Regions of the spectrum that carry important age information for children are identified by conducting Age-ID experiments over 21 frequency sub-bands. The results show that frequencies above 5.5 kHz are the least useful for Age-ID. The effect of gender-independent versus gender-dependent age-group modelling is also explored. The GMM-UBM and i-vector systems considerably outperform the GMM-SVM system. The best Age-ID performance, 85.77%, is obtained by the i-vector system applied to speech band-limited to 5.5 kHz. Experiments on human Age-ID were also conducted, and the results show that human listeners do not match the performance of the machine.
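The decision rule shared by likelihood-based Age-ID systems can be illustrated with a minimal sketch: fit one model per age group, score the test frames against each model, and pick the highest-scoring group. The sketch below is not the paper's system. It assumes a single diagonal-covariance Gaussian per group as a stand-in for an adapted GMM, and the age-group labels, feature values and function names are all hypothetical.

```python
import math

def fit_diag_gaussian(frames):
    """Estimate per-dimension mean and variance from a list of feature vectors."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Floor the variance so the log-likelihood stays finite.
    var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, 1e-6)
           for d in range(dim)]
    return mean, var

def avg_log_likelihood(frames, model):
    """Average per-frame log-likelihood under a diagonal Gaussian."""
    mean, var = model
    total = 0.0
    for f in frames:
        for x, m, v in zip(f, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)

def identify_age_group(frames, models):
    """Return the age-group label whose model scores the frames highest."""
    return max(models, key=lambda g: avg_log_likelihood(frames, models[g]))

# Toy two-dimensional "features" for two hypothetical age groups.
train = {
    "younger": [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2]],
    "older": [[3.0, 0.5], [3.2, 0.4], [2.8, 0.6]],
}
models = {group: fit_diag_gaussian(frames) for group, frames in train.items()}
predicted = identify_age_group([[2.9, 0.5], [3.1, 0.45]], models)
```

In a full GMM-UBM system each group model would instead be a multi-component mixture adapted from a universal background model, and the features would be cepstral coefficients computed from the band-limited signal.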
