A Robust Instance-based Video Search Method by Re-ranking for Multiple Visual Codebooks

With the development of Internet and computer technology, the volume of digital video has grown explosively, so retrieving video clips of interest from massive video datasets quickly and efficiently has become an urgent problem in information retrieval (IR). In this paper, a new instance-based video search (INS) engine is proposed to address this problem. First, key-frames are extracted from the queries and the video dataset and grouped. Second, robust low-level visual features are extracted and projected onto multiple visual codebooks. Finally, after similarity computation and feature fusion, a re-ranking scheme votes on the final search results. The proposed framework was evaluated on the instance search (INS) task at TRECVID 2011 and achieved 2nd place among 47 participants worldwide, which indicates the effectiveness of the system.
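
The pipeline summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the descriptors, the codebook construction, the fusion rule, or the voting scheme, so the code assumes nearest-centroid quantization into bag-of-visual-words histograms, cosine similarity, averaging across codebooks as the fusion step, and scoring each clip by its best-matching key-frame as a stand-in for the re-ranking vote. All function names (`quantize`, `cosine`, `fused_similarity`, `rank_clips`) and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def quantize(descriptors, codebook):
    """Assign each descriptor to its nearest codeword; return an L1-normalized histogram."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)), key=lambda i: math.dist(d, codebook[i]))
        hist[nearest] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def cosine(a, b):
    """Cosine similarity between two histograms (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def fused_similarity(query_desc, frame_desc, codebooks):
    """Assumed fusion rule: average the per-codebook cosine similarities."""
    sims = [
        cosine(quantize(query_desc, cb), quantize(frame_desc, cb))
        for cb in codebooks
    ]
    return sum(sims) / len(sims)


def rank_clips(query_desc, keyframes, codebooks):
    """Rank clips by their best key-frame score (assumed stand-in for the voting step).

    keyframes is a list of (clip_id, descriptors) pairs; a clip may appear several times.
    """
    best = defaultdict(float)
    for clip_id, desc in keyframes:
        score = fused_similarity(query_desc, desc, codebooks)
        best[clip_id] = max(best[clip_id], score)
    return sorted(best, key=best.get, reverse=True)


# Toy usage with 2-D "descriptors" and two small hypothetical codebooks.
codebooks = [
    [(0.0, 0.0), (1.0, 1.0)],
    [(0.0, 0.5), (1.0, 0.5)],
]
query = [(0.1, 0.1), (0.9, 0.9)]
keyframes = [
    ("clip_a", [(0.0, 0.1), (1.0, 0.9)]),
    ("clip_b", [(0.9, 1.0), (1.0, 1.0)]),
]
print(rank_clips(query, keyframes, codebooks))
```

Averaging across codebooks is only one of several possible fusion rules; a weighted sum or rank-based fusion would fit the same structure by changing `fused_similarity` alone.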
