Combination of Multiple Classifiers for Improving Emotion Recognition in Mandarin Speech

Automatic emotional speech recognition systems can be characterized by the selected features, the emotional categories investigated, the methods used to collect speech utterances, the languages, and the type of classifier used in the experiments. To date, several classifiers have been adopted independently and tested on numerous emotional speech corpora, but no single classifier is sufficient to classify the emotional classes optimally. In this paper, we focus on schemes for combining multiple classifiers to achieve the best possible recognition rate for the task of five-class emotion recognition in Mandarin speech. The investigated classifiers include KNN, WKNN, WCAP, W-DKNN and SVM. The experimental results show that classifier combination schemes, including the majority voting method, the minimum misclassification method and the maximum accuracy method, perform better than the single classifiers in terms of overall accuracy, with improvements ranging from 0.9% to 6.5%.
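
As an illustration of the simplest of these combination schemes, majority voting can be sketched as follows. This is a minimal sketch and not the implementation evaluated in this paper: the function name `majority_vote`, the emotion labels, and the tie-breaking rule (prefer the label that appears first in the list) are all assumptions made for the example.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most classifiers.

    predictions: list of labels, one per classifier, for a single utterance.
    Ties are broken in favor of the label that appears first in the list
    (an assumption of this sketch, not a rule taken from the paper).
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    top = max(counts.values())
    # Scan in classifier order so that ties resolve deterministically.
    for label in predictions:
        if counts[label] == top:
            return label


# Hypothetical outputs of five classifiers for one utterance.
votes = ["angry", "happy", "angry", "neutral", "angry"]
print(majority_vote(votes))
```

With an odd number of classifiers and five classes, ties remain possible (for example, two votes each for two labels), so any practical voting scheme needs an explicit tie-breaking rule such as the one assumed here.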