Considering relative order of emotional degree in dimensional speech emotion recognition

Dimensional speech emotion recognition (Dim-SER) is a rising branch of the emotion computing field. It views emotion from a dimensional, continuous perspective and formalizes the SER problem as a regression task. Existing Dim-SER research does not consider the relative order of emotional degree between utterances, which can give the human-machine interface wrong information about the trend of the speaker's emotional variation. To address this need, this paper constructs an order-sensitive Dim-SER system, taking the characteristics of human emotion cognition as a reference, and employs the Gamma statistic to evaluate emotion recognition performance. Specifically, a Top-rank probability distribution is developed to describe the emotional ordering of utterances, and the Kullback-Leibler divergence is used to measure the loss of order consistency caused by emotion recognition. Finally, the Order-Sensitive Network (OSNet) algorithm is proposed to minimize this prediction loss. Experimental results show that, compared with the commonly used k-Nearest Neighbor (k-NN) and Support Vector Regression (SVR) approaches, the proposed system effectively improves the correctness of the relative emotional order between utterances.
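The ordering loss and the evaluation measure described above can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: the Top-rank probability is taken to be a softmax over the emotion scores of a group of utterances (as in ListNet-style listwise ranking), and the Gamma statistic is taken to be the Goodman-Kruskal gamma computed over utterance pairs. All function names are hypothetical, and the sketch is not the OSNet algorithm itself.

```python
import math
from itertools import combinations


def top_rank_probabilities(scores):
    """Assumed form: softmax over scores, i.e. the probability that
    each utterance is ranked first within the group."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def order_loss(true_scores, predicted_scores):
    """Loss of order consistency: KL divergence between the top-rank
    distributions of the annotated and the predicted emotion scores."""
    return kl_divergence(
        top_rank_probabilities(true_scores),
        top_rank_probabilities(predicted_scores),
    )


def gamma_statistic(true_scores, predicted_scores):
    """Goodman-Kruskal gamma (assumed definition): (C - D) / (C + D),
    where C and D count concordant and discordant pairs; ties are ignored."""
    concordant = discordant = 0
    for i, j in combinations(range(len(true_scores)), 2):
        product = (true_scores[i] - true_scores[j]) * (
            predicted_scores[i] - predicted_scores[j]
        )
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    if concordant + discordant == 0:
        return 0.0  # every pair is tied, so the order carries no information
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical arousal scores for three utterances (illustrative values only)
annotated = [0.1, 0.5, 0.9]
same_order = [0.2, 0.4, 0.6]
reversed_order = [0.9, 0.5, 0.1]
```

Under these assumptions, `gamma_statistic(annotated, same_order)` equals 1 because every pair is concordant, while the reversed prediction yields -1. The order loss is zero when the predicted scores equal the annotated ones and positive when the two top-rank distributions differ.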