Reduce false positives for object detection by a priori probability in videos

In this work, we address the problem of reducing false positives in object detection in videos. We use motion cues to build a foreground probability model, and then compute the mean expectation of the pixel-level foreground probability within each sliding window to assign that window an a priori probability during detection. The proposed foreground model is evaluated within the Deformable Part Models (DPM) detection framework. We combine the response of the DPM detector with the mean probability expectation to form a feature vector and train a linear classifier on it. The proposed approach is threshold-free and uses foreground cues to reduce false positives in object detection. In addition, we describe an integral probability image for fast computation of the mean probability expectation. Experimental results show that the proposed method achieves superior performance over the Deformable Part Models baseline.
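The integral probability image mentioned above can be illustrated with a standard summed-area table. The following is a minimal sketch, not the authors' implementation: it assumes the foreground model yields a per-pixel probability map in [0, 1], and the function names (`integral_probability_image`, `mean_window_probability`, `window_features`) and the `(x, y, w, h)` box layout are hypothetical choices made for illustration.

```python
import numpy as np


def integral_probability_image(prob):
    """Summed-area table of a per-pixel foreground probability map.

    A zero row and column are prepended so that window sums need no
    special-casing at the image border.
    """
    prob = np.asarray(prob, dtype=np.float64)
    ii = np.zeros((prob.shape[0] + 1, prob.shape[1] + 1), dtype=np.float64)
    ii[1:, 1:] = prob.cumsum(axis=0).cumsum(axis=1)
    return ii


def mean_window_probability(ii, x, y, w, h):
    """Mean foreground probability inside a window, using four lookups.

    (x, y) is the top-left corner in pixels; w and h are the window size.
    """
    total = ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]
    return total / (w * h)


def window_features(dpm_score, ii, box):
    """Two-dimensional feature: [detector response, mean probability].

    This layout is an assumption; the abstract only says the two
    quantities are combined and fed to a linear classifier.
    """
    x, y, w, h = box
    return np.array([dpm_score, mean_window_probability(ii, x, y, w, h)])
```

Once the table is built in a single pass over the frame, each sliding window costs a constant number of lookups regardless of its size, which is what makes evaluating the mean for every candidate window inexpensive.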
