An Efficient Approach based on Image Pixel and Semantic Features Towards Video Retrieval

With the development of imaging and vision technology, video data is being produced in large volumes. Although a variety of methods have achieved excellent performance, quickly and accurately retrieving a needed video from a large collection remains important and challenging. In this paper we propose a video retrieval method that combines the pixel and semantic features of keyframes. During retrieval, we extract the pixel and semantic features of the input image and obtain a candidate set of shots by matching the pixel features. We then rank the candidate set by the semantic features of the keyframes it contains. Finally, we return the matching video and the position of the shots within it. On this basis, we designed and implemented an image-based video retrieval tool. The effectiveness of the tool was verified on related datasets and in practical applications.
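The two-stage flow described above (pixel-level matching to form a candidate set, then semantic ranking of that set) can be illustrated with a minimal sketch. The abstract does not specify the feature extractors or distance measures, so everything concrete here is an assumption made for illustration: pixel features are stood in for by a short binary code compared with Hamming distance, semantic features by a small vector compared with cosine similarity, and the names `Keyframe`, `retrieve`, `max_bits`, and `top_k` are hypothetical.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Keyframe:
    video_id: str      # video the keyframe's shot belongs to
    shot_start: float  # shot position in the video, in seconds (assumed unit)
    pixel_code: int    # assumed: binary code standing in for pixel features
    semantic: list     # assumed: vector standing in for semantic features


def hamming(a, b):
    """Number of differing bits between two binary codes."""
    return bin(a ^ b).count("1")


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve(query_code, query_semantic, index, max_bits=2, top_k=3):
    # Stage 1: candidate shots whose pixel code is close to the query's.
    candidates = [kf for kf in index
                  if hamming(query_code, kf.pixel_code) <= max_bits]
    # Stage 2: rank the candidates by semantic similarity to the query.
    ranked = sorted(candidates,
                    key=lambda kf: cosine(query_semantic, kf.semantic),
                    reverse=True)
    # Return the video and the shot position for the best matches.
    return [(kf.video_id, kf.shot_start) for kf in ranked[:top_k]]


# Toy index with made-up values, for illustration only.
index = [
    Keyframe("video_a", 12.0, 0b10110010, [0.9, 0.1, 0.0]),
    Keyframe("video_b", 47.5, 0b10110011, [0.2, 0.9, 0.1]),
    Keyframe("video_c", 3.0, 0b01001101, [0.9, 0.1, 0.0]),
]
results = retrieve(0b10110010, [1.0, 0.0, 0.0], index)
print(results)
```

In this toy index, `video_c` is semantically identical to `video_a` but is dropped in the first stage because its pixel code differs in every bit. This shows why the cheap pixel match is applied first: the more expensive semantic comparison is only run on the shots that survive it.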
