Predicting Media Memorability Using Ensemble Models

Memorability, defined as the quality of being worth remembering, is a pressing issue in media as we struggle to organize and retrieve digital content and make it more useful in our daily lives. The Predicting Media Memorability task at MediaEval 2019 tackles this problem by challenging participants to automatically predict memorability scores, building on the work developed in 2018. Our team ensembled transfer learning approaches with embeddings of video captions and our own pre-computed features. The resulting ensemble outperformed MediaEval 2018's state-of-the-art architectures.
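
As an illustration of what ensembling per-model predictions can look like, the following is a minimal sketch and not the authors' implementation. It assumes each model outputs one memorability score per video and combines them by weighted averaging; the combination rule, the function name `ensemble_scores`, the model names, and all numbers are hypothetical placeholders.

```python
import numpy as np


def ensemble_scores(predictions, weights=None):
    """Combine per-model memorability scores by (weighted) averaging.

    predictions: dict mapping a model name to a sequence of scores,
                 one score per video, in the same video order.
    weights:     optional dict mapping the same names to non-negative
                 weights; equal weights are used when omitted.
    """
    names = sorted(predictions)
    # Shape (n_models, n_videos)
    stacked = np.stack([np.asarray(predictions[n], dtype=float) for n in names])
    if weights is None:
        w = np.full(len(names), 1.0 / len(names))
    else:
        w = np.array([weights[n] for n in names], dtype=float)
        w = w / w.sum()  # normalise so the weights sum to 1
    return w @ stacked


# Hypothetical scores for three videos from three hypothetical models.
preds = {
    "caption_embeddings": [0.80, 0.60, 0.90],
    "transfer_learning": [0.70, 0.80, 0.85],
    "precomputed_features": [0.90, 0.70, 0.80],
}
combined = ensemble_scores(preds)
weighted = ensemble_scores(
    preds,
    weights={"caption_embeddings": 2, "transfer_learning": 1, "precomputed_features": 1},
)
```

In practice the weights would be chosen on a held-out validation split rather than fixed by hand.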
