Multimodal Sparse Coding for Event Detection

Abstract: Unsupervised feature learning methods have proven effective for classification tasks based on a single modality. We present multimodal sparse coding for learning feature representations shared across multiple modalities. The shared representations are applied to multimedia event detection (MED) and compared against their unimodal counterparts, as well as other feature learning methods such as the sparse autoencoder and the restricted Boltzmann machine (RBM). We report the cross-validated classification accuracy and mean average precision of the MED system trained on features learned in our unimodal and multimodal settings on the TRECVID MED 2014 dataset.
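
To make the multimodal setting concrete, the sketch below shows one common joint formulation of sparse coding: per-clip descriptors from two modalities are concatenated and a single shared dictionary is learned, so each sparse code jointly explains both modalities. This is a minimal illustration using scikit-learn's MiniBatchDictionaryLearning on random toy data; the feature dimensions, dictionary size, and sparsity weight are assumptions for illustration, not the paper's exact pipeline or hyperparameters.

    import numpy as np
    from sklearn.decomposition import MiniBatchDictionaryLearning

    rng = np.random.RandomState(0)

    # Toy stand-ins for per-clip unimodal descriptors (e.g., audio and video).
    audio = rng.randn(500, 40)
    video = rng.randn(500, 60)

    # Joint (multimodal) setting: concatenate the modalities and learn one
    # shared overcomplete dictionary with an L1 sparsity penalty.
    X = np.hstack([audio, video])

    dico = MiniBatchDictionaryLearning(
        n_components=128,                # shared dictionary size (assumed)
        alpha=1.0,                       # sparsity penalty weight (assumed)
        transform_algorithm="lasso_lars",
        random_state=0,
    )
    codes = dico.fit(X).transform(X)     # sparse codes used as learned features
    print(codes.shape)                   # (500, 128)

The resulting sparse codes would then serve as the feature vectors fed to the downstream event-detection classifier; the unimodal baseline is obtained by running the same procedure on each modality separately.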