Video saliency based on rarity prediction: Hyperaptor

Saliency models provide heatmaps that highlight the areas of an image which attract human gaze. Most are designed for still images, although an increasing number are extended to video by adding dynamic features. Nevertheless, only a few are specifically designed to handle the temporal dimension. We propose a new model that quantifies rarity natively in a spatiotemporal way. Based on a sliding temporal window, static and dynamic features are summarized by a time-evolving "surface" of feature statistics, which we call the "hyperhistogram". The rarity maps obtained for each feature are combined with the result of a superpixel algorithm to obtain a more object-based saliency. The proposed model is called Hyperaptor, which stands for hyperhistogram-based rarity prediction. The model is evaluated on a dataset of 12 videos against 2 ground-truth references using 3 metrics, and it is shown to outperform state-of-the-art models.
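The core idea of rarity-based saliency can be sketched as follows: feature statistics are pooled over a sliding temporal window, and each pixel is scored by the self-information of its feature value, so that rare values score high. This is a minimal illustrative sketch, not the authors' implementation; the function name, bin count, and window handling are assumptions.

```python
# Illustrative sketch (not the authors' code): rarity from feature
# statistics pooled over a sliding temporal window.
import numpy as np

def rarity_map(frames, n_bins=16):
    """frames: list of 2-D uint8 arrays (one feature channel over a
    temporal window). Returns a rarity map for the most recent frame,
    where rare feature values (low probability over the window) score high."""
    stack = np.stack(frames)                      # (T, H, W)
    # One slice of the "hyperhistogram": statistics over the whole window.
    hist, _ = np.histogram(stack, bins=n_bins, range=(0, 256))
    p = hist / hist.sum()                         # bin probabilities
    # Self-information -log2(p): rare bins -> high saliency.
    info = -np.log2(p + 1e-12)
    last = frames[-1]
    bins = np.clip((last.astype(int) * n_bins) // 256, 0, n_bins - 1)
    r = info[bins]
    return (r - r.min()) / (r.max() - r.min() + 1e-12)  # normalise to [0, 1]
```

In the full model this per-feature rarity would be computed for several static and dynamic features, fused, and refined with superpixel boundaries; the sketch shows only the single-feature rarity step.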
