Object Detection with Vocabularies of Space-time Descriptors

This paper presents a novel framework for object detection in security and broadcast videos. Our method assumes that object classes are not known in advance and exploits the space-time properties of the videos to create a vocabulary that describes these classes. Local space-time features have recently become a popular video representation for action recognition and object detection. Several methods for feature localization and description have been proposed in the literature, and promising recognition results have been demonstrated for a number of action classes. In this work we propose using different kinds of descriptors to create vocabularies for different object detection tasks. To describe the videos better, we build a background model that cleans up the scene and follows the regions that contain objects. The interest points that characterize the objects are computed with a temporal (space-time) variant of the well-known Harris corner detector. From the descriptors extracted at these interest points, a vocabulary is built using the kinds of videos we want to train on. We then compute frequency histograms of the training videos over the vocabulary and use them to train a binary classifier on the target classes. The same procedure, minus the vocabulary construction step, is then used to detect and track the objects. The proposed method is also compared with a state-of-the-art method and obtains better results in both accuracy and false object rejection.
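The vocabulary and histogram steps described above can be illustrated with a minimal sketch. It assumes the vocabulary is built by k-means clustering of the descriptors, which is a common choice for bag-of-features methods but is not stated in the abstract. The function names, the two-dimensional toy descriptors, and the parameters `k`, `iters`, and `seed` are all hypothetical and chosen only for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(words, d):
    """Index of the vocabulary word closest to descriptor d."""
    return min(range(len(words)), key=lambda i: sq_dist(words[i], d))


def build_vocabulary(descriptors, k, iters=20, seed=0):
    """Cluster descriptors into k visual words (assumed: plain k-means)."""
    rng = random.Random(seed)
    words = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest(words, d)].append(d)
        for i, members in enumerate(clusters):
            if members:  # keep the old word if its cluster is empty
                words[i] = [sum(col) / len(members) for col in zip(*members)]
    return words


def bow_histogram(descriptors, words):
    """Normalized frequency histogram of one video over the vocabulary."""
    counts = [0] * len(words)
    for d in descriptors:
        counts[nearest(words, d)] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]


# Toy descriptors standing in for features extracted at interest points.
video_a = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.1]]
video_b = [[10.0, 10.1], [9.9, 10.0], [10.1, 9.8]]

vocab = build_vocabulary(video_a + video_b, k=2)
hist_a = bow_histogram(video_a, vocab)
hist_b = bow_histogram(video_b, vocab)
```

In a full pipeline, histograms such as `hist_a` and `hist_b` would be the feature vectors passed to the binary classifier; at detection time only `bow_histogram` is called, reusing the vocabulary learned during training.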
