Pitch envelope based frame level score reweighted algorithm for emotion robust speaker recognition

Emotional variation in speech degrades the performance of speaker recognition systems. In this paper, a novel score normalization approach, the pitch envelope based frame level score reweighted (PFLSR) algorithm, is introduced to compensate for the influence of affective speech on speaker recognition. The approach assumes that, for most frames, the maximum likelihood model does not change easily under expressive corruption. The test frames are therefore divided according to F0 into two groups: heavily affected frames and slightly affected frames. The scores of the slightly affected frames are reweighted to strengthen their confidence and to optimize the final frame score accumulated over the whole test utterance. Experiments are conducted on the Mandarin Affective Speech Corpus. An improvement of 15.1% in identification rate over traditional speaker recognition is achieved.
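
The frame-level reweighting idea described above can be illustrated with a minimal sketch. It is not the PFLSR algorithm itself, whose partition rule and weighting function are not given here. The sketch assumes that per-frame log-likelihood scores for each enrolled speaker are already available, that a frame counts as heavily affected when its F0 exceeds a fixed threshold, and that slightly affected frames receive a constant weight. The names `reweighted_identify`, `f0_threshold`, and `boost` are hypothetical.

```python
from typing import Sequence


def reweighted_identify(
    frame_scores: Sequence[Sequence[float]],
    f0: Sequence[float],
    f0_threshold: float,
    boost: float = 2.0,
) -> int:
    """Return the index of the speaker with the highest weighted score sum.

    frame_scores[t][s] is the log-likelihood of frame t under speaker s.
    ASSUMPTION: a frame is "slightly affected" when its F0 is at or below
    f0_threshold; the source does not specify the actual partition rule.
    ASSUMPTION: slightly affected frames get a constant weight `boost`,
    heavily affected frames keep weight 1.0.
    """
    if len(frame_scores) != len(f0):
        raise ValueError("frame_scores and f0 must have the same length")
    if not frame_scores:
        raise ValueError("at least one frame is required")

    num_speakers = len(frame_scores[0])
    totals = [0.0] * num_speakers

    for scores, pitch in zip(frame_scores, f0):
        # Strengthen the contribution of frames assumed to be less
        # distorted by emotion; leave the others unchanged.
        weight = boost if pitch <= f0_threshold else 1.0
        for s, score in enumerate(scores):
            totals[s] += weight * score

    # Identification decision: speaker with the largest accumulated score.
    return max(range(num_speakers), key=totals.__getitem__)


# Toy data: two low-pitch frames favour speaker 0, one high-pitch frame
# strongly favours speaker 1.
scores = [[-1.0, -2.0], [-1.0, -2.0], [-5.0, -1.0]]
pitch = [120.0, 125.0, 260.0]

plain = reweighted_identify(scores, pitch, f0_threshold=200.0, boost=1.0)
weighted = reweighted_identify(scores, pitch, f0_threshold=200.0, boost=3.0)
```

With uniform weights, the single high-pitch frame dominates the sum and the decision goes to speaker 1. With the low-pitch frames weighted up, the decision goes to speaker 0. The toy numbers are invented for illustration and say nothing about the reported 15.1% improvement.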