A Novel Visual-Vocabulary-Translator-Based Cross-Domain Image Matching

Cross-domain image matching, which searches for images across different visual domains such as photos, sketches, and paintings, has attracted intensive attention in computer vision because of its wide range of applications. Unlike intra-domain images, cross-domain images differ considerably in many characteristics, which causes most existing approaches to fail. The large difference between cross-domain images, however, resembles the gap between English and Chinese, two languages that are linked by an English-Chinese translation dictionary. Inspired by this idea, in this paper we propose a novel visual vocabulary translator for cross-domain image matching. The translator consists of two main modules: a pair of vocabulary trees, which can be regarded as the codebooks of their respective domains, and an index file built from cross-domain image pairs. Through such a translator, a feature from one visual domain can be translated into another. The proposed algorithm is extensively evaluated on two kinds of cross-domain matching tasks, i.e., photo-to-sketch matching and photo-to-painting matching. Experimental results demonstrate the effectiveness and efficiency of the visual vocabulary translator. By employing this translator, the proposed algorithm achieves satisfactory performance in different matching systems. Furthermore, our work shows great potential for use in multiple visual domains.
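The translator idea described above can be illustrated with a minimal sketch. It is not the paper's implementation and rests on several assumptions: flat codebooks stand in for the vocabulary trees, the "index file" is modeled as a simple co-occurrence count between visual words of paired images, and all names (`nearest_word`, `build_index`, `translate`) and the toy 2-D features are hypothetical.

```python
import numpy as np

def nearest_word(features, codebook):
    """Quantize each feature row to the index of its nearest codeword."""
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)

def build_index(pairs, codebook_a, codebook_b):
    """Assumed stand-in for the index file: count how often word i of
    domain A and word j of domain B occur in the same image pair."""
    table = np.zeros((len(codebook_a), len(codebook_b)))
    for feats_a, feats_b in pairs:
        words_a = np.unique(nearest_word(feats_a, codebook_a))
        words_b = np.unique(nearest_word(feats_b, codebook_b))
        table[np.ix_(words_a, words_b)] += 1
    return table

def translate(feature, codebook_a, codebook_b, table):
    """Map a domain-A feature to the domain-B codeword that co-occurs
    most often with the feature's own visual word."""
    word_a = nearest_word(feature[None, :], codebook_a)[0]
    word_b = int(table[word_a].argmax())
    return codebook_b[word_b]

# Toy codebooks (hypothetical): two visual words per domain, 2-D features.
codebook_a = np.array([[0.0, 0.0], [10.0, 10.0]])   # e.g. photo domain
codebook_b = np.array([[1.0, 1.0], [-5.0, 5.0]])    # e.g. sketch domain

# Two cross-domain image pairs, each given as (photo features, sketch features).
pairs = [
    (np.array([[0.1, 0.2], [-0.3, 0.1]]), np.array([[-5.2, 4.9]])),
    (np.array([[9.8, 10.1]]), np.array([[1.1, 0.8], [0.9, 1.2]])),
]

table = build_index(pairs, codebook_a, codebook_b)
translated = translate(np.array([0.2, -0.1]), codebook_a, codebook_b, table)
print(translated)
```

In this sketch the lookup table plays the role of the dictionary in the English-Chinese analogy: a query feature is first expressed as a word in its own domain and then replaced by the most frequently paired word of the other domain. A hierarchical vocabulary tree would replace the brute-force `nearest_word` search for large codebooks.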
