Image retrieval and classification on deep convolutional SparkNet

Image retrieval and classification are hot topics in computer vision and have attracted great attention nowadays with the emergence of large-scale data. We propose a new scheme to use both deep learning models and large-scale computing platform and jointly learn powerful feature representations in image classification and retrieval. We achieve a superior performance on the ImageNet dataset, where the framework is easy to be embedded for daily user experience. First we conduct the classification task using deep convolutional neural networks with several novel techniques, including batch normalization and multi-crop testing to obtain a better performance. Then we transfer the network's knowledge to image retrieval task by comparing the feature codebook of the query image with those feature database extracted from the deep model. Such a search pipeline is implemented in a MapReduce framework on the Spark platform, which is suitable for large-scale and real-time data processing. At last, the system outputs to users some textual information of the predicted object searching from Internet as well as similar images from the retrieval stage, making our work a real application.

[1]  Huchuan Lu,et al.  LCNN: Low-level Feature Embedded CNN for Salient Object Detection , 2015, ArXiv.

[2]  Yihong Gong,et al.  Linear spatial pyramid matching using sparse coding for image classification , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[3]  Wilson C. Hsieh,et al.  Bigtable: A Distributed Storage System for Structured Data , 2006, TOCS.

[4]  David G. Lowe,et al.  Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[5]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[6]  Scott Shenker,et al.  Spark: Cluster Computing with Working Sets , 2010, HotCloud.

[7]  Tieniu Tan,et al.  Deep semantic ranking based hashing for multi-label image retrieval , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Sanjay Ghemawat,et al.  MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.

[10]  Matti Pietikäinen,et al.  Local Binary Patterns for Still Images , 2011 .

[11]  Xiaogang Wang,et al.  Multi-Bias Non-linear Activation in Deep Neural Networks , 2016, ICML.

[12]  GhemawatSanjay,et al.  The Google file system , 2003 .

[13]  Xiang Zhang,et al.  OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks , 2013, ICLR.

[14]  Frédéric Jurie,et al.  Sampling Strategies for Bag-of-Features Image Classification , 2006, ECCV.

[15]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[16]  Qiang Chen,et al.  Network In Network , 2013, ICLR.

[17]  Matti Pietikäinen,et al.  Computer Vision Using Local Binary Patterns , 2011, Computational Imaging and Vision.

[18]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[19]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.