Deep Residual Net Based Compact Feature Representation for Image Retrieval

Deep learning has been introduced into many multimedia processing tasks, including multimedia retrieval. In this paper, we propose a deep residual net (ResNet) based compact feature representation to improve content-based image retrieval (CBIR) performance. The proposed method integrates ResNet and hashing networks to convert raw images into binary codes. The binary codes of the images in the query set are compared with those of the database images using Hamming distance for retrieval. Comprehensive experiments are conducted on three public databases. The results show that the proposed method outperforms state-of-the-art methods. Furthermore, the impact of the depth of the deep convolutional network (DCNN) on performance is investigated.
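The retrieval step described above (binary codes compared by Hamming distance) can be illustrated with a minimal sketch. It is not the authors' implementation: the ResNet and hashing networks are not reproduced, the activation values are made-up stand-ins for hash-layer outputs, thresholding at zero is an assumed binarization rule, and the function names `binarize`, `hamming`, and `retrieve` are hypothetical.

```python
def binarize(activations):
    """Pack real-valued outputs into an integer bit code.

    Assumption: a bit is 1 when the activation is positive, else 0.
    The abstract does not state the thresholding rule.
    """
    code = 0
    for value in activations:
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming(code_a, code_b):
    """Count the bit positions in which two codes differ."""
    return bin(code_a ^ code_b).count("1")


def retrieve(query_code, db_codes, top_k=3):
    """Return indices of the top_k database codes nearest to the query.

    sorted() is stable, so ties keep their database order.
    """
    ranked = sorted(range(len(db_codes)),
                    key=lambda i: hamming(query_code, db_codes[i]))
    return ranked[:top_k]


# Hypothetical 4-bit hash-layer outputs for four database images.
db_activations = [
    [0.9, -0.2, 0.4, -0.7],
    [-0.5, -0.1, 0.3, 0.8],
    [0.7, -0.6, 0.2, -0.3],
    [-0.4, 0.6, -0.9, 0.1],
]
query_activations = [0.8, -0.4, 0.5, -0.1]

db_codes = [binarize(row) for row in db_activations]
query_code = binarize(query_activations)
top_indices = retrieve(query_code, db_codes)
top_distances = [hamming(query_code, db_codes[i]) for i in top_indices]
print(top_indices, top_distances)
```

Storing each code as an integer keeps the comparison to one XOR and a bit count, which is the main reason compact binary codes make large-scale search cheap in both memory and time.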