Combining regression and classification methods for improving automatic speaker age recognition

We present a novel approach to automatic speaker age classification, which combines regression and classification to achieve competitive classification accuracy on telephone speech. Support vector machine regression is used to generate finer age estimates, which are combined with the posterior probabilities of well-trained discriminative gender classifiers to predict both the age and gender of a speaker. We show that this combination performs better than direct 7-class classifiers. The regressors and classifiers are trained using longterm features such as pitch and formants, as well as short-term (frame-based) features derived from MAP adaptation of GMMs that were trained on MFCCs.