Coupled ensemble graph cuts and object verification for animal segmentation from highly cluttered videos

In this paper, we consider animal object segmentation from wildlife monitoring videos captured by motion-triggered cameras, called camera traps. This task is challenging because wildlife monitoring scenes are often highly cluttered and dynamic. To address it, we propose coupling ensemble graph cuts with object verification. We formulate video object cut as an ensemble of frame-level background-foreground classifiers that fuse information across frames and refine their segmentation results collaboratively and iteratively. To significantly reduce false positives in foreground animal detection and segmentation, we learn an object verification model that classifies whether each segmented image patch belongs to the background or to an animal. Extensive experiments and performance comparisons on a diverse set of challenging camera-trap data, as well as on the new Change Detection 2014 benchmark dataset, demonstrate that the proposed framework outperforms various state-of-the-art algorithms and can handle even the most challenging objects in a wide variety of video sequences.
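
The two-stage idea in the abstract (cross-frame fusion of frame-level foreground estimates, followed by a verification step that rejects false positives) can be illustrated with a small sketch. This is not the authors' implementation: simple temporal averaging stands in for the coupled graph-cut refinement, and the names `fuse_across_frames`, `verify_mask`, and `score_fn`, along with the `weight` and `threshold` parameters, are hypothetical and chosen only for illustration.

```python
import numpy as np


def fuse_across_frames(probs, n_iters=3, weight=0.5):
    """Iteratively blend each frame's foreground probabilities with its
    temporal neighbours.

    probs: array of shape (T, H, W) with per-frame foreground probabilities.
    NOTE: plain averaging is an assumed stand-in for graph-cut refinement.
    """
    p = np.asarray(probs, dtype=float).copy()
    for _ in range(n_iters):
        # Pad in time so the first and last frames reuse themselves.
        padded = np.concatenate([p[:1], p, p[-1:]], axis=0)
        neighbours = 0.5 * (padded[:-2] + padded[2:])
        p = (1.0 - weight) * p + weight * neighbours
    return p


def verify_mask(mask, frame, score_fn, threshold=0.5):
    """Keep a segmentation mask only if a verifier accepts its patch.

    score_fn is a hypothetical callable returning an 'animal' score in
    [0, 1] for the bounding-box patch around the mask.
    """
    if not mask.any():
        return mask
    ys, xs = np.nonzero(mask)
    patch = frame[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    if score_fn(patch) >= threshold:
        return mask
    # Verifier rejected the patch: treat the whole region as background.
    return np.zeros_like(mask)
```

In this sketch a frame whose own classifier misses the animal can still recover foreground evidence from its neighbours, and a region that survives fusion is discarded if the verifier scores its patch below `threshold`.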
