Long term video segmentation through pixel level spectral clustering on GPUs

We introduce a new technique for video segmentation that combines state-of-the-art image segmentation and optical flow algorithms on GPUs. We avoid pre-clustering into superpixels and probabilistic reasoning, and instead treat the problem as a generalization of image segmentation. Using spectral clustering at the pixel level (as opposed to 2D/3D superpixels), we demonstrate video segmentation over hundreds of frames, far beyond what pixel-level spectral segmentation techniques have achieved before. Our algorithm achieves accuracy comparable to that of sparse motion clustering techniques while maintaining 100% segmentation density over long time periods. Compared with dense video segmentation techniques, it achieves better accuracy with less oversegmentation. These results are made possible by the increased computational power available through GPU parallelism and by efficient numerical algorithms. We show our results on the motion segmentation dataset [4]. Our technique can also provide good-quality 3D superpixels and can be extended to tasks where the ability to track 3D volumes over time is useful.
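To make the idea of pixel-level spectral clustering over a video volume concrete, the following is a minimal sketch and not the implementation described above. It makes several simplifying assumptions: affinities depend only on intensity differences, temporal links join the same pixel location in consecutive frames rather than following optical flow, the eigenvector is found by plain power iteration on the CPU rather than by a GPU eigensolver, and only a two-way split is produced. All function names and the synthetic clip are hypothetical.

```python
import math
import random


def build_affinity(video, sigma=0.1):
    """Sparse affinity graph with one node per pixel of a grayscale video.

    video[t][y][x] is an intensity in [0, 1]. Each pixel is linked to its
    right and lower neighbours in the same frame and to the same location in
    the next frame (an assumption standing in for optical-flow links).
    """
    T, H, W = len(video), len(video[0]), len(video[0][0])
    edges = {}  # node index -> {neighbour index: weight}
    for t in range(T):
        for y in range(H):
            for x in range(W):
                p = (t * H + y) * W + x
                edges.setdefault(p, {})
                for dt, dy, dx in ((0, 0, 1), (0, 1, 0), (1, 0, 0)):
                    t2, y2, x2 = t + dt, y + dy, x + dx
                    if t2 < T and y2 < H and x2 < W:
                        q = (t2 * H + y2) * W + x2
                        diff = video[t][y][x] - video[t2][y2][x2]
                        w = math.exp(-(diff * diff) / (sigma * sigma))
                        edges[p][q] = w
                        edges.setdefault(q, {})[p] = w
    return edges


def second_eigenvector(edges, iters=1000, seed=0):
    """Second generalized eigenvector of (D - W) y = lambda D y.

    Runs power iteration on (M + I) / 2 with M = D^-1/2 W D^-1/2, projecting
    out the known leading eigenvector D^1/2 * 1 at every step.
    """
    n = len(edges)
    root = [math.sqrt(sum(edges[p].values())) for p in range(n)]
    norm = math.sqrt(sum(r * r for r in root))
    v1 = [r / norm for r in root]
    rng = random.Random(seed)
    z = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(z, v1))
        z = [a - dot * b for a, b in zip(z, v1)]
        mz = [sum(w * z[q] / (root[p] * root[q]) for q, w in edges[p].items())
              for p in range(n)]
        z = [0.5 * (a + b) for a, b in zip(mz, z)]
        length = math.sqrt(sum(a * a for a in z))
        z = [a / length for a in z]
    return [z[p] / root[p] for p in range(n)]


def segment_two_way(video, sigma=0.1):
    """Label every pixel 0 or 1 by the sign of the second eigenvector."""
    y = second_eigenvector(build_affinity(video, sigma))
    return [1 if value > 0 else 0 for value in y]


def moving_square_video(frames=3, height=4, width=6):
    """Hypothetical clip: a bright 2x2 square moving right one pixel per frame."""
    return [[[0.8 if 1 <= y <= 2 and t + 1 <= x <= t + 2 else 0.2
              for x in range(width)]
             for y in range(height)]
            for t in range(frames)]


if __name__ == "__main__":
    labels = segment_two_way(moving_square_video())
    print(labels[:24])  # labels for the first frame, in row-major order
```

Because every pixel in every frame is a node, the labelling is dense by construction, and the square keeps a single label across frames through the temporal links. The graph grows linearly with the number of frames, which is why long sequences call for sparse storage and a fast eigensolver.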

[1]  Jitendra Malik,et al.  Motion segmentation and tracking using normalized cuts , 1998, Sixth International Conference on Computer Vision.

[2]  Jitendra Malik,et al.  Normalized Cuts and Image Segmentation , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[3]  Daniel DeMenthon,et al.  Spatio-temporal segmentation of video by hierarchical mean shift analysis , 2002 .

[4]  S. Shankar Sastry,et al.  Generalized principal component analysis (GPCA) , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[5]  Marc Pollefeys,et al.  A General Framework for Motion Segmentation: Independent, Articulated, Rigid, Non-rigid, Degenerate and Non-degenerate , 2006, ECCV.

[6]  Martial Hebert,et al.  Learning to Find Object Boundaries Using Motion Cues , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[7]  Jan-Michael Frahm,et al.  Feature tracking and matching in video using programmable graphics hardware , 2007, Machine Vision and Applications.

[8]  Jitendra Malik,et al.  Using contours to detect and localize junctions in natural images , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  René Vidal,et al.  Motion segmentation via robust subspace separation in the presence of outlying, incomplete, or corrupted trajectories , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Dimitris N. Metaxas,et al.  Video object segmentation by hypergraph cut , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[11]  Kurt Keutzer,et al.  Efficient, high-quality image contour detection , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[12]  Daniel Cremers,et al.  Anisotropic Huber-L1 Optical Flow , 2009, BMVC.

[13]  Jitendra Malik,et al.  From contours to regions: An empirical evaluation , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  William Brendel,et al.  Video object segmentation by tracking regions , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[15]  Jan-Michael Frahm,et al.  Building Rome on a Cloudless Day , 2010, ECCV.

[16]  Jitendra Malik,et al.  Object Segmentation by Long Term Analysis of Point Trajectories , 2010, ECCV.

[17]  Alan L. Yuille,et al.  Occlusion Boundary Detection Using Pseudo-depth , 2010, ECCV.

[18]  Eric L. Miller,et al.  Multiple Hypothesis Video Segmentation from Superpixel Flows , 2010, ECCV.

[19]  Kurt Keutzer,et al.  Dense Point Trajectories by GPU-Accelerated Large Displacement Optical Flow , 2010, ECCV.

[20]  Mei Han,et al.  Efficient hierarchical graph-based video segmentation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[21]  Jitendra Malik,et al.  Large Displacement Optical Flow: Descriptor Matching in Variational Motion Estimation , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.