Static and dynamic feature-based visual attention model: Comparison to human judgment

In this paper, a novel bottom-up visual attention model is proposed. Using static and dynamic features, the model determines salient areas in video scenes; it is characterized by the fusion of spatial information with moving-object detection. The static pathway, inspired by the human visual system, consists of retinal filtering followed by a cortical decomposition. The dynamic pathway is carried out by estimating and compensating camera motion. Although many visual attention approaches have been developed for various applications, few have been compared against human perception. A psychophysical experiment is therefore presented to compare the model with human perception and to validate it. The results provide a quantitative analysis and show the efficiency of this approach.
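The fusion described above can be illustrated with a minimal NumPy sketch. This is a hypothetical simplification, not the authors' implementation: the paper's retinal filtering and cortical decomposition are replaced by a crude box-filter center-surround contrast, and the robust camera-motion estimation and compensation are reduced to plain frame differencing (i.e., identity compensation, as with a static camera). All function names are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def box_blur(img, k):
    """Mean filter with an odd window size k (edge-padded)."""
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    return sliding_window_view(p, (k, k)).mean(axis=(2, 3))

def static_saliency(frame):
    """Center-surround contrast: a crude stand-in for the paper's
    retinal band-pass filtering and cortical decomposition."""
    center = box_blur(frame, 3)
    surround = box_blur(frame, 9)
    return np.abs(center - surround)

def dynamic_saliency(prev, curr):
    """Residual motion after (assumed) camera-motion compensation;
    with a static camera this reduces to frame differencing."""
    return np.abs(curr - prev)

def normalize(m, eps=1e-8):
    """Rescale a map to [0, 1]; a flat map stays all zeros."""
    return (m - m.min()) / (m.max() - m.min() + eps)

def fused_saliency(prev, curr):
    """Fuse the two pathways by a pixelwise max of normalized maps."""
    return np.maximum(normalize(static_saliency(curr)),
                      normalize(dynamic_saliency(prev, curr)))
```

A moving bright patch then dominates the fused map through the dynamic pathway, while high-contrast static structure contributes through the static one; the pixelwise max is one simple fusion choice among several possible (sum, weighted mean).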
