Video Object Retrieval by Trajectory and Appearance

The prevalence of video recording capability, in both surveillance systems and mobile devices, has made video data increasingly common. As a result, video management has become more important, and video retrieval in particular has become one of its main issues. Traditional video retrieval systems take text as input and search a video's title, annotations, or embedded textual data for matching information, much like the keyword search of a common search engine. However, because such a query cannot specify visual information, the results are often inaccurate or even useless. For this reason, video retrieval systems that take images or videos as input have also been proposed; nevertheless, the ambiguity and complexity of such queries make these systems difficult to implement, and they have not been as successful as desired. To address this, we propose retrieving a desired object from video by specifying its trajectory and/or appearance, with the help of a 3-D graphical user interface for more intuitive interaction, so that more satisfactory results can be achieved. We believe that such a framework could serve as a foundation for the behavior analysis used in many surveillance systems.
