Propagating multi-class pixel labels throughout video frames

The effective propagation of pixel labels through the spatial and temporal domains is vital to many computer vision and multimedia problems, yet little attention has been paid to propagation in the temporal (video) domain. Previous video label propagation algorithms largely avoided dense optical flow estimation because of its computational cost and inaccuracy, and relied heavily on complex (and slower) appearance models. In this paper we show the limitations of using purely motion-based or purely appearance-based propagation alone, in particular that the performance of each varies across different types of video. We propose a probabilistic framework that estimates the reliability of the two sources and automatically adjusts the weights between them. Our experiments show that the “dragging effect” of pure optical-flow-based methods is effectively avoided, while the problems of pure appearance-based methods, such as large intra-class variance, are also effectively handled.
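The idea of weighting two label sources by their estimated reliability can be illustrated with a minimal per-pixel sketch. This is not the paper's actual model: the abstract does not say how reliability is estimated, so the entropy-based reliability score below, the linear mixing rule, and the names `entropy_reliability` and `fuse_pixel` are all assumptions made for illustration. Each source is assumed to supply a normalized distribution over the same set of class labels for a pixel.

```python
import math


def entropy_reliability(probs):
    """Hypothetical reliability score in [0, 1].

    Returns 1 for a one-hot (fully confident) distribution and 0 for a
    uniform one. This is an illustrative stand-in, not the paper's estimator.
    """
    k = len(probs)
    if k < 2:
        return 1.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    # Clamp to guard against tiny negative values from rounding.
    return max(0.0, 1.0 - h / math.log(k))


def fuse_pixel(motion_probs, appearance_probs, eps=1e-9):
    """Mix two per-pixel label distributions, weighted by reliability.

    motion_probs:     label distribution propagated along optical flow
    appearance_probs: label distribution predicted by an appearance model
    Returns (fused distribution, weight given to the motion source).
    """
    r_m = entropy_reliability(motion_probs)
    r_a = entropy_reliability(appearance_probs)
    total = r_m + r_a
    # If neither source is informative, fall back to equal weights.
    w_m = 0.5 if total < eps else r_m / total
    mixed = [w_m * pm + (1.0 - w_m) * pa
             for pm, pa in zip(motion_probs, appearance_probs)]
    s = sum(mixed)
    return [m / s for m in mixed], w_m


if __name__ == "__main__":
    # Motion source is confident, appearance source is uninformative.
    fused, w_m = fuse_pixel([1.0, 0.0, 0.0], [1 / 3, 1 / 3, 1 / 3])
    print(fused, w_m)
```

In this sketch, a pixel whose flow-propagated label is unambiguous is dominated by the motion source, whereas a pixel where the flow is uncertain falls back on the appearance model. A full system would also need spatial smoothing across neighboring pixels, which is omitted here.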
