Paper information - Combining regression and classification methods for improving automatic speaker age recognition

We present a novel approach to automatic speaker age classification that combines regression and classification to achieve competitive classification accuracy on telephone speech. Support vector machine regression is used to generate fine-grained age estimates, which are combined with the posterior probabilities of well-trained discriminative gender classifiers to predict both the age and the gender of a speaker. We show that this combination performs better than direct 7-class classifiers. The regressors and classifiers are trained on long-term features such as pitch and formants, as well as short-term (frame-based) features derived from MAP adaptation of GMMs that were trained on MFCCs.
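The combination step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the actual combination rule, the age boundaries, or the class inventory, so everything below is an assumption made for illustration: the thresholds `CHILD_MAX_AGE`, `YOUNG_MAX_AGE`, and `ADULT_MAX_AGE`, the class names, the choice to treat children as a single class independent of gender, and the simple "threshold the age, pick the more probable gender" rule. The inputs stand in for the output of an age regressor and a gender classifier, which are not implemented here.

```python
# Hypothetical age boundaries in years; the abstract does not specify them.
CHILD_MAX_AGE = 14
YOUNG_MAX_AGE = 24
ADULT_MAX_AGE = 54


def age_group(estimated_age: float) -> str:
    """Map a continuous age estimate (e.g. from a regressor) to a coarse group."""
    if estimated_age <= CHILD_MAX_AGE:
        return "child"
    if estimated_age <= YOUNG_MAX_AGE:
        return "young"
    if estimated_age <= ADULT_MAX_AGE:
        return "adult"
    return "senior"


def combine(estimated_age: float, p_male: float) -> str:
    """Combine an age estimate with a gender posterior into one of 7 classes.

    Assumption: children form a single class regardless of gender, which
    gives 1 + 3 * 2 = 7 classes in total.
    """
    if not 0.0 <= p_male <= 1.0:
        raise ValueError("p_male must be a probability in [0, 1]")
    group = age_group(estimated_age)
    if group == "child":
        return "child"
    # Pick the more probable gender from the classifier posterior.
    gender = "male" if p_male >= 0.5 else "female"
    return f"{group}_{gender}"


if __name__ == "__main__":
    # Made-up inputs standing in for regressor and classifier outputs.
    print(combine(31.2, 0.91))
    print(combine(67.5, 0.12))
    print(combine(9.0, 0.40))
```

A hard threshold on the gender posterior is the simplest possible choice; a system could instead weight the posterior against the distance of the age estimate from a class boundary, but the abstract gives no detail on this.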

[1] Christian A. Müller, et al. Combining short-term cepstral and long-term pitch features for automatic recognition of speaker age, 2007, INTERSPEECH.

[2] Paul Boersma, et al. Praat, a system for doing phonetics by computer, 2002.

[3] Elmar Nöth, et al. Analyzing features for automatic age estimation on cross-sectional data, 2009, INTERSPEECH.

[4] E. Mysak. Pitch and duration characteristics of older males, 1959, Journal of speech and hearing research.

[5] Christian A. Müller, et al. Automatic recognition of speakers' age and gender on the basis of empirical studies, 2006, INTERSPEECH.

[6] Bernhard Schölkopf, et al. A tutorial on support vector regression, 2004, Stat. Comput.

[7] Florian Metze, et al. Comparison of Four Approaches to Age and Gender Recognition for Telephone Applications, 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[8] Chih-Jen Lin, et al. LIBSVM: A library for support vector machines, 2011, TIST.

[9] Christian A. Müller, et al. Zweistufige kontextsensitive Sprecherklassifikation am Beispiel von Alter und Geschlecht, 2005.