Saliency in Crowd

Theories and models of saliency that predict where people look focus on regular-density scenes. A crowded scene is characterized by the co-occurrence of a relatively large number of regions or objects that would each have stood out in a regular scene, so what drives attention in a crowd can differ significantly from the conclusions drawn in the regular setting. This work presents a first focused study on saliency in crowd. To facilitate the study, a new dataset of 500 images is constructed, with eye-tracking data from 16 viewers and annotation data on faces (the dataset will be publicly available with the paper). Statistical analyses point to key observations on the features and mechanisms of saliency in scenes with different crowd levels, and provide insights as to whether conventional saliency models hold in crowded scenes. Finally, a new model for saliency prediction that takes crowding information into account is proposed, and multiple kernel learning (MKL) is used as a core computational module to integrate various features at both low and high levels. Extensive experiments demonstrate the superior performance of the proposed model compared with the state of the art in saliency computation.
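To make the MKL idea concrete, the following is a minimal sketch of combining one kernel per feature group into a single weighted kernel. It is not the model proposed in this work: the abstract does not specify the kernels, features, or solver, so everything here is an assumption for illustration. The kernel weights come from a simple kernel-target alignment heuristic rather than a full MKL optimization, the final predictor is kernel ridge regression, and all function names (`rbf_kernel`, `alignment`, `mkl_weights`, `fit_predict`) are hypothetical.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    d2 = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def alignment(K, y):
    """Kernel-target alignment: <K, yy^T>_F / (||K||_F * ||yy^T||_F)."""
    T = np.outer(y, y)
    return (K * T).sum() / (np.linalg.norm(K) * np.linalg.norm(T))

def mkl_weights(kernels, y):
    """Heuristic kernel weights (assumption): clipped alignments, normalized to sum to 1."""
    a = np.array([max(alignment(K, y), 0.0) for K in kernels])
    return a / a.sum()

def fit_predict(train_groups, y, test_groups, gamma=1.0, ridge=0.1):
    """Fit kernel ridge regression on a weighted sum of per-group kernels.

    train_groups / test_groups: lists of arrays, one per feature group
    (for example, one low-level group and one high-level group).
    y: labels in {-1, +1} (for example, fixated vs. non-fixated samples).
    Returns the kernel weights and real-valued scores for the test samples.
    """
    Ks = [rbf_kernel(G, G, gamma) for G in train_groups]
    w = mkl_weights(Ks, y)
    K = sum(wi * Ki for wi, Ki in zip(w, Ks))
    alpha = np.linalg.solve(K + ridge * np.eye(len(y)), y)
    K_test = sum(
        wi * rbf_kernel(Gt, G, gamma)
        for wi, Gt, G in zip(w, test_groups, train_groups)
    )
    return w, K_test @ alpha
```

In this sketch, a feature group whose kernel agrees with the labels receives a larger weight, so an uninformative group contributes little to the combined kernel. A full MKL formulation would instead learn the weights jointly with the classifier.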
