THU-HCSI at MediaEval 2016: Emotional Impact of Movies Task

In this paper, we describe our team’s approach to the MediaEval 2016 Challenge “Emotional Impact of Movies”. In addition to the baseline features, we extract audio and image features from the video clips: a Convolutional Neural Network (CNN) is used to extract the image features, and the OpenSMILE toolbox is used to extract the audio features. For the continuous prediction task, we also study a multi-scale approach at different levels, using Long Short-Term Memory (LSTM) and Bi-directional Long Short-Term Memory (BLSTM) models. Fusion methods are also considered and discussed. The evaluation results demonstrate the effectiveness of our approaches.
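To illustrate what fusing per-modality predictions can look like, the sketch below combines frame-level predictions from two models by a weighted average (late fusion). This is an assumption made for illustration only: the abstract does not specify which fusion methods are used, and the function name `late_fusion`, the weights, and the example values are hypothetical.

```python
from typing import List, Sequence


def late_fusion(
    predictions: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Weighted average of per-model prediction sequences.

    Hypothetical helper for illustration; not the method of the paper.
    Each entry of `predictions` is one model's output over the same time steps.
    """
    if not predictions:
        raise ValueError("at least one prediction sequence is required")
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per prediction sequence")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all prediction sequences must have the same length")
    # Normalize by the weight sum so the result stays in the input range.
    return [
        sum(w * p[t] for w, p in zip(weights, predictions)) / total
        for t in range(length)
    ]


# Made-up arousal predictions from an audio model and an image model.
audio_pred = [0.2, 0.4, 0.6]
image_pred = [0.4, 0.4, 0.2]
fused = late_fusion([audio_pred, image_pred], [1.0, 1.0])
```

With equal weights this reduces to a plain mean of the two prediction sequences; in practice the weights would be tuned on a validation set.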