Where to look at the movies: Analyzing visual attention to understand movie editing

When making a movie, directors constantly consider where the spectator will look on the screen. Shot composition, framing, camera movements, and editing are tools commonly used to direct attention. To provide a quantitative analysis of the relationship between those tools and gaze patterns, we propose a new eye-tracking database containing gaze pattern information on movie sequences, together with editing annotations, and we show how state-of-the-art computational saliency techniques behave on this dataset. In this work, we reveal strong links between movie editing and spectators' scanpaths, and suggest several directions in which knowledge of editing information could improve human visual attention modeling for cinematic content. The dataset generated and analysed during the current study is available at https://github.com/abruckert/eye_tracking_filmmaking
