Paper information - Identification of age-group from children's speech by computers and humans

This paper presents results on age-group identification (Age-ID) for children's speech, using the OGI Kids corpus and GMM-UBM, GMM-SVM and i-vector systems. Regions of the spectrum that carry important age information for children are identified by conducting Age-ID experiments over 21 frequency sub-bands. The results show that frequencies above 5.5 kHz are the least useful for Age-ID. The effect of using gender-independent and gender-dependent age-group modelling is also explored. The GMM-UBM and i-vector systems considerably outperform the GMM-SVM system. The best Age-ID performance, 85.77%, is obtained by the i-vector system applied to speech band-limited to 5.5 kHz. Experiments on human Age-ID were also conducted, and the results show that humans do not match the performance of the machine.
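The likelihood-based decision rule behind a GMM-style Age-ID system can be illustrated with a small sketch. This is not the paper's implementation: it assumes a single diagonal Gaussian per age group (a one-component stand-in for a full GMM-UBM), toy two-dimensional features instead of real acoustic features, and hypothetical group labels `younger` and `older`. All function names are illustrative.

```python
import math


def fit_diag_gaussian(frames):
    """Estimate per-dimension mean and variance from a list of feature vectors."""
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    variances = [
        # Floor the variance so the log-likelihood stays finite.
        max(sum((f[d] - means[d]) ** 2 for f in frames) / n, 1e-6)
        for d in range(dim)
    ]
    return means, variances


def avg_log_likelihood(frames, model):
    """Average per-frame log-likelihood under a diagonal Gaussian."""
    means, variances = model
    total = 0.0
    for f in frames:
        for x, m, v in zip(f, means, variances):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)


def identify_age_group(frames, models):
    """Return the age-group label whose model scores the utterance highest."""
    return max(models, key=lambda g: avg_log_likelihood(frames, models[g]))


# Toy training data: hypothetical 2-D "features" for two assumed age groups.
train = {
    "younger": [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [1.1, 2.1]],
    "older": [[3.0, 0.5], [3.2, 0.4], [2.8, 0.6], [3.1, 0.7]],
}
models = {group: fit_diag_gaussian(frames) for group, frames in train.items()}

# Classify an unseen "utterance" made of two frames.
predicted = identify_age_group([[1.05, 1.9], [0.95, 2.05]], models)
print(predicted)
```

A real system along the lines described in the abstract would replace the single Gaussian with a multi-component mixture adapted from a universal background model, or with i-vectors, and would score features extracted from band-limited speech.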
