Superpixel Labeling Priors and MRF for Aerial Video Segmentation

Video segmentation is the task of partitioning pixels that exhibit homogeneous appearance and motion into coherent spatio-temporal groups, and it remains challenging for aerial applications. In this paper, we propose a principled combination of superpixel labeling priors and a Markov random field (S-MRF) for aerial video segmentation. The approach makes three contributions: 1) we develop a metadata-based global projection model with coordinate transformation to estimate motion between frames; 2) we incorporate the superpixel labeling priors from previous frames into the segmentation of the current frame, which yields a highly efficient probabilistic label propagation algorithm; and 3) we perform an MRF optimization on the initial segments with the propagated labeling priors to improve temporal coherency. In addition, we collect a new video dataset, which will be made publicly available, for evaluating aerial video segmentation algorithms. Experimental results show that the proposed approach outperforms state-of-the-art video segmentation methods.
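
To make the idea of propagating labeling priors through a global motion model more concrete, the following is a minimal sketch and not the paper's implementation. It assumes that the inter-frame motion is already available as a 3x3 homography `H` (a stand-in for the metadata-based projection model), that the previous frame has a hard label map, and that the current frame has been over-segmented into superpixels with integer ids starting at 0. The function names `warp_labels` and `superpixel_priors` are hypothetical.

```python
import numpy as np

def warp_labels(prev_labels, H):
    """Nearest-neighbour warp of a label map into the current frame.

    Assumption: H maps previous-frame homogeneous coordinates (x, y, 1)
    to current-frame coordinates. Pixels with no source get label -1.
    """
    h, w = prev_labels.shape
    ys, xs = np.mgrid[0:h, 0:w]
    cur = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    # Inverse mapping: for every current pixel, look up its source pixel.
    src = np.linalg.inv(H) @ cur
    sx = np.rint(src[0] / src[2]).astype(int)
    sy = np.rint(src[1] / src[2]).astype(int)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    warped = np.full(h * w, -1, dtype=int)
    warped[valid] = prev_labels[sy[valid], sx[valid]]
    return warped.reshape(h, w)

def superpixel_priors(warped, superpixels, num_labels, eps=1e-6):
    """Per-superpixel label distribution from the warped label map.

    Each row is a normalised histogram of propagated labels inside one
    superpixel; eps keeps rows well defined when nothing was propagated.
    """
    n_sp = int(superpixels.max()) + 1
    counts = np.full((n_sp, num_labels), eps)
    valid = warped >= 0
    np.add.at(counts, (superpixels[valid], warped[valid]), 1.0)
    return counts / counts.sum(axis=1, keepdims=True)
```

In a pipeline of this kind, the resulting prior matrix would typically serve as the unary term of an MRF over superpixels, with a pairwise smoothness term between neighbouring superpixels; that optimization step is not shown here.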
