360RAT: A Tool for Annotating Regions of Interest in 360-degree Videos

This paper introduces 360RAT, a software tool for annotating regions of interest (RoIs) in 360-degree videos. These regions correspond to the portions of the content that are important for telling the video's story. We believe this software is a valuable tool for studying different aspects of 360-degree videos, including what viewers consider relevant and interesting to the experience. As part of this work, we conducted a subjective experiment in which 9 human observers used the proposed software to annotate 11 360-degree videos. The result is a dataset of annotated 360-degree videos, i.e., videos with marked RoIs and their semantic classifications. We present a simple analysis of the annotations gathered in the experiment for a subset of the videos, observing higher inter-participant agreement for videos containing fewer objects. We also compared the RoI maps with saliency maps computed with the Cube Padding saliency model [13] and found a strong correlation between the two, indicating a link between the annotated RoIs and the saliency properties of the content.
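The two quantitative claims in the abstract can be made concrete with a small sketch. The Python snippet below (our own illustration, not the paper's evaluation code) computes Pearson's linear correlation coefficient (CC), one of the standard saliency metrics discussed in [9], between an annotated RoI map and a model saliency map, optionally weighted by cos(latitude) to compensate for equirectangular oversampling near the poles, and an intersection-over-union (IoU) score as one simple way to quantify inter-annotator agreement. All array names and the weighting scheme are illustrative assumptions.

```python
# Minimal sketch, assuming RoI and saliency maps are given as 2D NumPy
# arrays in equirectangular format. Not the paper's actual pipeline.
import numpy as np

def pearson_cc(roi_map, sal_map, weights=None):
    """Weighted Pearson correlation between two maps of equal shape (H, W)."""
    a = roi_map.astype(np.float64).ravel()
    b = sal_map.astype(np.float64).ravel()
    w = np.ones_like(a) if weights is None else weights.astype(np.float64).ravel()
    w = w / w.sum()
    ma, mb = np.sum(w * a), np.sum(w * b)
    cov = np.sum(w * (a - ma) * (b - mb))
    sa = np.sqrt(np.sum(w * (a - ma) ** 2))
    sb = np.sqrt(np.sum(w * (b - mb) ** 2))
    return float(cov / (sa * sb + 1e-12))

def annotator_iou(mask_a, mask_b):
    """Intersection-over-union between two binary RoI masks (one per annotator)."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return float(inter) / float(union) if union > 0 else 1.0

def latitude_weights(height, width):
    """Per-pixel cos(latitude) weights: equirectangular frames oversample the poles."""
    lat = np.linspace(-np.pi / 2, np.pi / 2, height)  # one latitude per row
    return np.repeat(np.cos(lat)[:, None], width, axis=1)

# Example usage with random stand-in data:
H, W = 240, 480
roi = np.random.rand(H, W)  # stand-in for an averaged annotator RoI map
sal = np.random.rand(H, W)  # stand-in for a Cube Padding saliency map
print(pearson_cc(roi, sal, latitude_weights(H, W)))
```

A per-frame CC score averaged over a video would give one plausible way to quantify the reported RoI/saliency correlation; likewise, averaging pairwise IoU over all annotator pairs gives a simple agreement score per video.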

[1] D. Muchaluat-Saade et al., Sensory Effect Extraction for 360° Media Content, 2021, WebMedia.

[2] M. Ouhyoung et al., Label360: An Annotation Interface for Labeling Instance-Aware Semantic Labels on Panoramic Full Images, 2020, SIGGRAPH Asia Posters.

[3] L. Argyriou et al., Design methodology for 360° immersive video applications: the case study of a cultural heritage virtual tour, 2020, Personal and Ubiquitous Computing.

[4] M. Gelautz et al., A tool for semi-automatic ground truth annotation of traffic videos, 2020, Electronic Imaging.

[5] H. Fassold et al., Towards Automatic Cinematography and Annotation for 360° Video, 2019, TVX.

[6] M. C. Q. Farias et al., A taxonomy and dataset for 360° videos, 2019, MMSys.

[7] A. Dutta et al., The VGG Image Annotator (VIA), 2019, arXiv.

[8] Y.-P. Hung et al., Virtual Reality as an Art Form, 2019, Proceedings of International Conference on Artificial Life and Robotics.

[9] F. Durand et al., What Do Different Evaluation Metrics Tell Us About Saliency Models?, 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10] A. Smolic et al., Director's cut: a combined dataset for visual attention analysis in cinematic VR content, 2018, CVMP '18.

[11] P. Wonka et al., PanoAnnotator: a semi-automatic tool for indoor panorama layout annotation, 2018, SIGGRAPH Asia Posters.

[12] L. Sassatelli et al., Film editing: new levers to improve VR streaming, 2018, MMSys.

[13] M. Sun et al., Cube Padding for Weakly-Supervised Saliency Prediction in 360° Videos, 2018, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14] M. Winckler et al., Snap-changes: a dynamic editing strategy for directing viewer's attention in streaming virtual reality videos, 2018, AVI.

[15] M. Bethge et al., Saliency Benchmarking Made Easy: Separating Models, Maps and Metrics, 2017, ECCV.

[16] A. Alencar, Narrative cues within cinematic virtual reality: an exploratory study of narrative cues within the content and motives of virtual reality developers, 2018.

[17] G. Wetzstein et al., Movie editing and cognitive event segmentation in virtual reality video, 2017, ACM Transactions on Graphics.

[18] M.-Y. Liu et al., Deep 360 Pilot: Learning a Deep Agent for Piloting through 360° Sports Videos, 2017, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19] J. S. Pillai et al., Grammar of VR Storytelling: Visual Cues, 2017, VRIC.

[20] E. D. Ragan et al., Coordinating attention and cooperation in multi-user virtual reality narratives, 2017, IEEE Virtual Reality (VR).

[21] T. Nawaz et al., ViTBAT: Video tracking and behavior annotation tool, 2016, 13th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS).

[22] A. Shen, BeaverDam: Video Annotation Tool for Computer Vision Training Labels, 2016.

[23] P. Perona et al., Microsoft COCO: Common Objects in Context, 2014, ECCV.

[24] D. J. Patterson et al., Efficiently Scaling up Crowdsourced Video Annotation, 2012, International Journal of Computer Vision.

[25] A. Torralba et al., LabelMe: A Database and Web-Based Tool for Image Annotation, 2008, International Journal of Computer Vision.