A deep one-shot network for query-based logo retrieval

Abstract Logo detection in real-world scene images is an important problem with applications in advertising and marketing. Existing general-purpose object detection methods require large amounts of annotated training data for every logo class. These methods cannot keep up with the incremental addition of logo classes that practical deployment demands, since it is practically impossible to obtain such annotated data for each new, unseen logo. In this work, we develop an easy-to-implement query-based logo detection and localization system by employing a one-shot learning technique built from off-the-shelf neural network components. Given an image of a query logo, our model searches for the logo within a given target image and predicts its possible location by estimating a binary segmentation mask. The proposed model consists of a conditional branch and a segmentation branch. The former produces a conditional latent representation of the query logo, which is combined with the feature maps of the segmentation branch at multiple scales to obtain the matching location of the query logo in the target image. Matching the latent query representation against the multi-scale feature maps of the segmentation branch, using a simple concatenation operation followed by a 1 × 1 convolution layer, makes our model scale-invariant. Despite its simplicity, our query-based logo retrieval framework achieves superior performance on the FlickrLogos-32 and TopLogos-10 datasets over existing baseline methods.
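
The fusion step described in the abstract (concatenating the latent query representation with a segmentation feature map, then applying a 1 × 1 convolution) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the function name `fuse_query`, the nested-list tensor layout, and all toy shapes and weights are assumptions chosen for illustration. The sketch relies on the fact that a 1 × 1 convolution is a linear map applied independently at each spatial position, so the same query vector can be tiled over feature maps of any resolution.

```python
# Illustrative sketch only: names, shapes and weights are assumptions,
# not taken from the paper's code.

def fuse_query(feature_map, query_vec, weights, bias):
    """Tile query_vec over an H x W x C feature map, concatenate along the
    channel axis, and apply a 1x1 convolution (a per-pixel linear map).

    feature_map: H x W nested list of C-channel pixels
    query_vec:   list of D floats (latent query representation)
    weights:     K x (C + D) matrix of the 1x1 convolution
    bias:        list of K floats
    """
    fused = []
    for row in feature_map:
        fused_row = []
        for pixel in row:
            joined = list(pixel) + list(query_vec)  # channel-wise concatenation
            fused_row.append([
                sum(w * x for w, x in zip(w_row, joined)) + b
                for w_row, b in zip(weights, bias)
            ])
        fused.append(fused_row)
    return fused


# Hypothetical toy inputs: one query vector reused at two feature-map scales.
query = [1.0, -1.0]                      # D = 2
coarse = [[[0.5]]]                       # 1 x 1 map, C = 1
fine = [[[0.5], [1.0]], [[0.0], [2.0]]]  # 2 x 2 map, C = 1
weights = [[1.0, 2.0, 0.5]]              # K = 1 output channel, C + D = 3 inputs
bias = [0.0]

coarse_out = fuse_query(coarse, query, weights, bias)
fine_out = fuse_query(fine, query, weights, bias)
```

In this sketch the two scales share weights only to keep the example short; a real multi-scale network would typically learn a separate 1 × 1 convolution per scale, and would use a deep learning framework rather than nested lists.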
