Video attention: Learning to detect a salient object sequence

We study video attention by detecting a salient object sequence in a video segment. We formulate salient object sequence detection as an energy minimization problem in a conditional random field framework, in which static and dynamic salience, spatial and temporal coherence, and a global topic model are defined and integrated to identify the salient object sequence. Each salient object is represented by a rectangle, and a dynamic programming algorithm is designed to solve the resulting global optimization. We validate our approach on a large number of video segments with labeled salient object sequences.
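As an illustration of how dynamic programming can find a globally optimal rectangle sequence, the following is a minimal sketch under simplifying assumptions, not the formulation used in this work. It assumes each frame has already been reduced to a small set of candidate rectangles, that the per-frame terms (salience, spatial coherence, topic model) are collapsed into a single unary cost per candidate, and that temporal coherence is a pairwise penalty between rectangles in consecutive frames. The names `best_rectangle_sequence`, `temporal_cost`, and `weight` are hypothetical, and the L1 penalty is a stand-in for whatever coherence term is actually used.

```python
from typing import List, Sequence, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)


def temporal_cost(a: Rect, b: Rect) -> float:
    """Hypothetical coherence penalty: L1 distance between rectangle parameters."""
    return float(sum(abs(p - q) for p, q in zip(a, b)))


def best_rectangle_sequence(
    candidates: Sequence[Sequence[Rect]],
    unary: Sequence[Sequence[float]],
    weight: float = 1.0,
) -> Tuple[float, List[Rect]]:
    """Minimize sum_t unary[t][i_t] + weight * sum_t temporal_cost(r_{t-1}, r_t).

    candidates[t] lists the candidate rectangles of frame t and unary[t][i]
    is the (assumed precomputed) per-frame cost of candidate i in frame t.
    """
    n_frames = len(candidates)
    # cost[t][j]: minimum energy of any sequence ending at candidate j of frame t.
    cost = [list(unary[0])]
    # back[t - 1][j]: index of the best predecessor (in frame t - 1) of candidate j.
    back: List[List[int]] = []

    for t in range(1, n_frames):
        row_cost, row_back = [], []
        for j, rect in enumerate(candidates[t]):
            # Energy of reaching `rect` from each candidate of the previous frame.
            transitions = [
                cost[t - 1][i] + weight * temporal_cost(prev, rect)
                for i, prev in enumerate(candidates[t - 1])
            ]
            best_i = min(range(len(transitions)), key=transitions.__getitem__)
            row_cost.append(transitions[best_i] + unary[t][j])
            row_back.append(best_i)
        cost.append(row_cost)
        back.append(row_back)

    # Backtrack from the cheapest candidate in the last frame.
    j = min(range(len(cost[-1])), key=cost[-1].__getitem__)
    energy = cost[-1][j]
    path = [j]
    for t in range(n_frames - 1, 0, -1):
        j = back[t - 1][j]
        path.append(j)
    path.reverse()
    return energy, [candidates[t][i] for t, i in enumerate(path)]
```

Because the pairwise term couples only consecutive frames, the recursion runs in time linear in the number of frames and quadratic in the number of candidates per frame. Unlike a greedy frame-by-frame choice, it can accept a slightly worse candidate in early frames when that leads to a lower total energy over the whole segment.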
