Automatic extraction of salient objects in 3D stereoscopic videos

In 3D stereoscopic videos, depth perception is an important factor that affects human visual attention much more than the motion or texture contrast present in traditional 2D videos. In this context, the present paper addresses the issue of stereoscopic visual attention models designed to detect salient objects in 3D videos. We propose representing the image sequence as a 2D video stream together with its associated depth maps. The technique starts by combining a spatiotemporal attention model with a disparity map. The depth map provides important information about the objects' positions in space and helps estimate their relative distance to the video camera. The proposed method is evaluated on a set of ten 3D video streams, and the results indicate that it is efficient and robust.
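As an illustration of the combination step, the following minimal sketch fuses a spatiotemporal saliency map with a disparity map for a single frame. It is not the method of the paper: the abstract does not specify the fusion rule, so the convex combination, the weight `alpha`, the mean-based threshold, and all function names are assumptions made for this example. It also assumes that larger disparity values correspond to objects closer to the camera.

```python
import numpy as np


def normalize(m):
    """Rescale a map to [0, 1]; a constant map becomes all zeros."""
    m = np.asarray(m, dtype=np.float64)
    span = m.max() - m.min()
    if span == 0:
        return np.zeros_like(m)
    return (m - m.min()) / span


def fuse_saliency_with_disparity(saliency, disparity, alpha=0.5):
    """Hypothetical fusion rule: convex combination of the two normalized maps.

    Assumption: larger disparity means the pixel is closer to the camera,
    so nearer regions receive a higher fused score.
    """
    saliency = np.asarray(saliency, dtype=np.float64)
    disparity = np.asarray(disparity, dtype=np.float64)
    if saliency.shape != disparity.shape:
        raise ValueError("saliency and disparity maps must have the same shape")
    return alpha * normalize(saliency) + (1.0 - alpha) * normalize(disparity)


def extract_salient_mask(fused, threshold=None):
    """Binary mask of salient pixels; defaults to the mean fused score
    as threshold (an assumption, not taken from the paper)."""
    fused = np.asarray(fused, dtype=np.float64)
    if threshold is None:
        threshold = fused.mean()
    return fused > threshold


# Toy frame: a 2x2 object that is both salient and close to the camera,
# plus one background pixel with moderate disparity but no saliency.
saliency = np.zeros((4, 4))
saliency[1:3, 1:3] = 1.0
disparity = np.zeros((4, 4))
disparity[1:3, 1:3] = 10.0
disparity[0, 0] = 5.0

fused = fuse_saliency_with_disparity(saliency, disparity, alpha=0.5)
mask = extract_salient_mask(fused)
```

In this toy frame the object pixels score highest because both cues agree, while the background pixel that is only moderately close stays below the threshold. A multiplicative rule or a learned weighting would be equally plausible choices for the fusion.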
