Sound influences visual attention discriminately in videos

Visual attention has been studied for a long time, and various computational models have been proposed. However, most of these models do not take audio into account. In practice, visual signals are often accompanied by audio, so it is natural to investigate how audio influences visual attention. In this paper, we focus on the question of when audio influences visual attention during video watching. We perform eye-tracking experiments on a set of 60 videos under audio-visual (AV) and visual-only (V) conditions. Based on the eye movement data, we find that the influence of audio on visual attention depends on the consistency between the visual and audio signals. If the objects that are salient from the visual perspective are not consistent with those that are salient from the audio perspective, audio influences visual attention. Otherwise, audio has little influence on visual attention.
