Voiced and Unvoiced Content of Fear-Type Emotions in the SAFE Corpus

The present research focuses on the development of a fear detection system for surveillance applications based on acoustic cues. The emotional speech material used for this study comes from the previously collected SAFE Database (Situation Analysis in a Fictional and Emotional Database), which consists of audiovisual sequences extracted from fiction films. Here we address the question of a specific detection model based on unvoiced speech. For this purpose, a set of features is considered for both voiced and unvoiced speech. The salience of each feature is evaluated by computing the Fisher Discriminant Ratio for fear versus neutral discrimination. This study confirms that the voiced content, and the prosodic features in particular, are the most relevant. Finally, the detection system merges the information conveyed by both voiced and unvoiced acoustic content to enhance its performance. Fear is recognized with a success rate of 69.5%.
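As an illustration of the feature-salience step, the following is a minimal sketch of a per-feature Fisher Discriminant Ratio for two classes. It assumes the common two-class form, the squared difference of the class means divided by the sum of the class variances; the exact variant used in this study is not stated here. The function name, the array layout (one row per speech segment, one column per feature), and the toy values are hypothetical and are not data from the SAFE Corpus.

```python
import numpy as np

def fisher_discriminant_ratio(fear, neutral):
    """Per-feature FDR, assuming the common two-class form
    (mu_fear - mu_neutral)^2 / (var_fear + var_neutral).

    Inputs are arrays of shape (n_segments, n_features).
    Assumes every feature has nonzero variance in at least one class.
    """
    fear = np.asarray(fear, dtype=float)
    neutral = np.asarray(neutral, dtype=float)
    mean_gap = fear.mean(axis=0) - neutral.mean(axis=0)
    spread = fear.var(axis=0) + neutral.var(axis=0)
    return mean_gap ** 2 / spread

# Toy values (hypothetical): feature 0 separates the classes, feature 1 does not.
fear_segments = np.array([[4.0, 0.0], [6.0, 2.0]])
neutral_segments = np.array([[0.0, 0.0], [2.0, 2.0]])

fdr = fisher_discriminant_ratio(fear_segments, neutral_segments)
ranking = np.argsort(fdr)[::-1]  # feature indices, most salient first
print(fdr, ranking)
```

A higher ratio indicates a feature whose fear and neutral distributions are further apart relative to their spread, so sorting features by this value gives a simple salience ranking.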