Determining optimal signal features and parameters for HMM-based emotion classification

Recognising emotions from speech is a challenging task. Building an emotion recogniser requires well-defined signal features, suitable parameter sets, and a large amount of data. The recognition itself is also influenced by several conditions. This paper proposes an optimal parameter set for an HMM-based recogniser. To this end, we compared different signal features (MFCCs, LPCs, and PLPs) as well as several HMM architectures. Moreover, we evaluated our proposal on three databases (eNTERFACE, Emo-DB, and SmartKom). We give separate proposals for acted and naive emotion recognition, along with recommendations for efficient and sound validation methods.