Calibration of probabilistic age recognition

In speech technology, automatic age recognition is typically treated as a regression task, i.e., predicting the age of a speaker from his/her speech. In this paper we are interested in the probabilistic interpretation of the posterior distribution of the predicted age. We review a number of measures for assessing the probabilistic properties of the posterior distribution, and link these to detection theory, which is well understood in the automatic speaker recognition literature. We show that the Gaussian posterior distributions predicted by least squares support vector regression behave well, and that, under the Gaussian assumption, there is only little room for improving them.
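To make the idea of scoring a predicted posterior concrete, the following is a minimal sketch, assuming each prediction is a Gaussian with mean `mu` and standard deviation `sigma` (in years). It computes two standard scoring rules for continuous distributions, the logarithmic score and the continuous ranked probability score (CRPS), plus the probability integral transform (PIT) value used in calibration checks. The function names and the example numbers are hypothetical and are not taken from the paper, whose own choice of measures may differ.

```python
import math


def gaussian_log_score(mu, sigma, y):
    """Negative log density of y under N(mu, sigma^2); lower is better."""
    return 0.5 * math.log(2.0 * math.pi * sigma ** 2) + (y - mu) ** 2 / (2.0 * sigma ** 2)


def gaussian_crps(mu, sigma, y):
    """Closed-form CRPS of N(mu, sigma^2) at observation y; lower is better."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


def gaussian_pit(mu, sigma, y):
    """Predicted CDF evaluated at y; roughly uniform on [0, 1] if calibrated."""
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))


# Hypothetical predictions (mean age, standard deviation) and true ages.
predictions = [(30.0, 5.0), (45.0, 8.0), (62.0, 6.0)]
true_ages = [33.0, 41.0, 70.0]

n = len(true_ages)
mean_log_score = sum(gaussian_log_score(m, s, y) for (m, s), y in zip(predictions, true_ages)) / n
mean_crps = sum(gaussian_crps(m, s, y) for (m, s), y in zip(predictions, true_ages)) / n
pit_values = [gaussian_pit(m, s, y) for (m, s), y in zip(predictions, true_ages)]

print("mean log score:", mean_log_score)
print("mean CRPS:", mean_crps)
print("PIT values:", pit_values)
```

Averaged over a test set, both scores reward predictions that are accurate and appropriately confident: a standard deviation that is too small is penalised heavily when the true age falls far from the mean, while one that is too large is penalised for being uninformative. A histogram of the PIT values that departs strongly from uniform would suggest miscalibration.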
