Bi-layer segmentation from stereo video sequences by fusing multiple cues

Bi-layer video segmentation, the separation of a video into a foreground layer and a background layer, has attracted considerable research interest recently. It can be applied to many vision and multimedia tasks, such as human-computer interaction, gesture recognition, object detection and tracking, and personal video editing. Traditional approaches cluster pixels into homogeneous regions based on color distributions or motion patterns. However, spatial or temporal cues alone are insufficient to distinguish objects, because different objects may share similar colors or motions. The availability of stereo cameras now makes it possible to recover some 3D depth information from video. In this paper, we propose a bi-layer segmentation method for stereo video sequences that fuses multiple cues: 3D depth information, color/texture distributions, and motion vectors. We also exploit spatio-temporal coherence across consecutive frames to reduce the noise present in any single frame.
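
The abstract does not specify how the cues are combined, so the following is only a minimal sketch of one plausible fusion scheme, not the method of the paper. It assumes each cue (depth, color/texture, motion) has already been turned into a per-pixel foreground probability, fuses them by a weighted sum of log-odds, and smooths the result over time with an exponential moving average. The function names, the weights, the smoothing factor `alpha`, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
import math


def logit(p, eps=1e-6):
    """Log-odds of a probability, clamped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def fuse_cues(depth_p, color_p, motion_p, weights=(1.0, 1.0, 1.0)):
    """Fuse per-pixel foreground probabilities from three cues.

    Assumption: cues are combined as a weighted sum of log-odds
    (a naive-Bayes-style rule), then mapped back to a probability.
    """
    w_depth, w_color, w_motion = weights
    fused = []
    for d, c, m in zip(depth_p, color_p, motion_p):
        score = w_depth * logit(d) + w_color * logit(c) + w_motion * logit(m)
        fused.append(1.0 / (1.0 + math.exp(-score)))
    return fused


def temporal_smooth(previous, current, alpha=0.7):
    """Blend the current frame with the running estimate (assumed EMA)."""
    if previous is None:
        return list(current)
    return [alpha * c + (1.0 - alpha) * p for p, c in zip(previous, current)]


def segment(frames, threshold=0.5):
    """Return one binary mask (1 = foreground) per frame.

    `frames` is a sequence of (depth_p, color_p, motion_p) tuples, each a
    flat list of per-pixel foreground probabilities of equal length.
    """
    masks = []
    smoothed = None
    for depth_p, color_p, motion_p in frames:
        fused = fuse_cues(depth_p, color_p, motion_p)
        smoothed = temporal_smooth(smoothed, fused)
        masks.append([1 if p > threshold else 0 for p in smoothed])
    return masks
```

In this sketch, a pixel whose color and motion cues are ambiguous can still be labeled foreground when the depth cue is confident, and a pixel that looks like foreground in only one frame tends to be suppressed by the temporal smoothing. A real system would more likely impose spatial coherence as well, for example through a graph-cut or similar energy minimization, which is omitted here.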
