International Workshop on Deep Video Understanding

This is the introduction paper to the International Workshop on Deep Video Understanding, organized at the 22nd ACM International Conference on Multimodal Interaction. In recent years, a growing trend towards understanding videos (in particular movies) at a deeper level has motivated researchers in multimedia and computer vision to propose new approaches and datasets for this problem. It is a challenging research area that aims to develop a deep understanding of the relations between different individuals and entities in movies, using all available modalities such as video, audio, text, and metadata. The aim of this workshop is to foster innovative research in this new direction and to provide benchmarking evaluations that advance technologies in the deep video understanding community.
