Scalable logo recognition based on compact sparse dictionary for mobile devices

In this paper, we present a novel scalable logo recognition system that can recognize a large number of logo categories locally on mobile devices. The system requires no supervised training and is time-efficient at low memory cost. It is also robust to challenging conditions such as added noise and changes in image scale and rotation. To achieve this, we propose an efficient segmental quantization approach that generates more than one million visual words from a very compact vocabulary. The vocabulary consists of two small dictionaries learned through sparse non-negative matrix factorization (NMF) of local SIFT descriptors. With an inverted index built over these visual words, query images containing logos are recognized by efficiently retrieving the K-nearest neighbors (K-NN) among the logo instances in the dataset. Our vocabulary is only one thousandth the size of that used by the traditional Approximate K-Means (AKM) method, which is of great importance for mobile devices with limited memory. Furthermore, based on the compact dictionary, we present a promising verification step that filters false positives via sparse reconstruction of SIFT descriptors using very few sparse codes, exploiting the low reconstruction error that sparse coding provides. Experiments on our dataset of 400 logo classes show that the system is efficient and effective.
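
As a rough illustration of how two small dictionaries can address a very large composite vocabulary, the following Python sketch quantizes each descriptor segment against its own dictionary and combines the two atom indices into a single visual word used as the key of an inverted index. Several details here are assumptions made only for illustration and are not taken from the system itself: the descriptor is split into two equal halves, each segment is assigned to the atom with the largest inner product (a simple stand-in for sparse NMF coding), the dictionaries and descriptors are toy values, and all function names are hypothetical. Under these assumptions, two dictionaries of 1,000 atoms each would index 1,000 × 1,000 = one million words.

```python
from collections import Counter, defaultdict


def best_atom(segment, dictionary):
    """Return the index of the atom with the largest inner product.

    Assumption: this nearest-atom rule replaces the sparse NMF coding
    step described in the abstract.
    """
    scores = [sum(a * s for a, s in zip(atom, segment)) for atom in dictionary]
    return max(range(len(dictionary)), key=scores.__getitem__)


def visual_word(descriptor, dict_a, dict_b):
    """Combine two per-segment atom indices into one composite word id."""
    half = len(descriptor) // 2  # assumed split: two equal segments
    i = best_atom(descriptor[:half], dict_a)
    j = best_atom(descriptor[half:], dict_b)
    return i * len(dict_b) + j


def build_index(dataset, dict_a, dict_b):
    """Build an inverted index: visual word -> set of logo labels."""
    index = defaultdict(set)
    for label, descriptors in dataset:
        for d in descriptors:
            index[visual_word(d, dict_a, dict_b)].add(label)
    return index


def query(index, descriptors, dict_a, dict_b, k=1):
    """Vote for labels sharing visual words with the query; return top k."""
    votes = Counter()
    for d in descriptors:
        for label in index.get(visual_word(d, dict_a, dict_b), ()):
            votes[label] += 1
    return [label for label, _ in votes.most_common(k)]


# Toy data (hypothetical): 4-d descriptors, two dictionaries of 2 atoms each.
dict_a = [[1.0, 0.0], [0.0, 1.0]]
dict_b = [[1.0, 0.0], [0.0, 1.0]]
dataset = [
    ("acme", [[0.9, 0.1, 0.2, 0.8]]),
    ("globex", [[0.1, 0.9, 0.7, 0.3]]),
]
index = build_index(dataset, dict_a, dict_b)
result = query(index, [[0.8, 0.2, 0.1, 0.9]], dict_a, dict_b)
print(result)
```

Only the two small dictionaries need to be kept in memory; the composite word id is computed on the fly, which is the kind of saving the compact vocabulary is meant to provide.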
