Overview of MediaEval 2020 Predicting Media Memorability Task: What Makes a Video Memorable?

This paper describes the MediaEval 2020 Predicting Media Memorability task. First proposed at MediaEval 2018, the task is now in its third edition, as the prediction of short-term and long-term video memorability (VM) remains a challenging problem. In 2020, the format is the same as in previous editions. This year's videos are a subset of the TRECVid 2019 Video-to-Text dataset and contain more action-rich content than those of the 2019 task. The paper outlines the main characteristics of the task and describes the collection, the ground truth dataset, the evaluation metrics, and the requirements for participants’ run submissions.
