FoodLogoDet-1500: A Dataset for Large-Scale Food Logo Detection via Multi-Scale Feature Decoupling Network

Food logo detection plays an important role in multimedia because of its wide range of real-world applications, such as food recommendation in self-service shops and infringement detection on e-commerce platforms. A large-scale food logo dataset is urgently needed for developing advanced food logo detection algorithms. However, no available food logo dataset includes food brand information. To support efforts towards food logo detection, we introduce FoodLogoDet-1500, a new large-scale, publicly available food logo dataset with 1,500 categories, about 100,000 images, and about 150,000 manually annotated food logo objects. We describe the collection and annotation process of FoodLogoDet-1500, analyze its scale and diversity, and compare it with other logo datasets. To the best of our knowledge, FoodLogoDet-1500 is the first and largest publicly available high-quality dataset for food logo detection. The challenge of food logo detection lies in the large number of categories and the similarities between them. To address this, we propose a novel food logo detection method, the Multi-scale Feature Decoupling Network (MFDNet), which decouples classification and regression into two branches and focuses on the classification branch to solve the problem of distinguishing many food logo categories. Specifically, we introduce a feature offset module, which uses deformation learning to find the optimal classification offset and effectively obtains the most representative classification features for detection. In addition, we adopt a balanced feature pyramid in MFDNet, which attends to global information, balances the multi-scale feature maps, and enhances feature extraction capability. Comprehensive experiments on FoodLogoDet-1500 and two other popular benchmark logo datasets demonstrate the effectiveness of the proposed method. The code and FoodLogoDet-1500 can be found at https://github.com/hq03/FoodLogoDet-1500-Dataset.
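
The idea of balancing multi-scale feature maps can be illustrated with a minimal sketch. It is not the MFDNet implementation: it assumes a generic integrate-and-redistribute scheme on toy 1-D feature maps, omits any refinement of the averaged map, and the names `resize_nearest` and `balance_pyramid` are hypothetical.

```python
from typing import List


def resize_nearest(feature: List[float], size: int) -> List[float]:
    """Resample a 1-D feature map to `size` elements by nearest-neighbour lookup."""
    n = len(feature)
    return [feature[min(n - 1, (i * n) // size)] for i in range(size)]


def balance_pyramid(levels: List[List[float]], ref_index: int) -> List[List[float]]:
    """Average all levels at one reference resolution, then add the average back."""
    ref_size = len(levels[ref_index])
    # 1. Bring every pyramid level to the reference resolution.
    resized = [resize_nearest(f, ref_size) for f in levels]
    # 2. Integrate: element-wise mean across levels.
    balanced = [sum(col) / len(levels) for col in zip(*resized)]
    # 3. Redistribute: resize the balanced map to each level and add it as a residual.
    return [
        [x + b for x, b in zip(f, resize_nearest(balanced, len(f)))]
        for f in levels
    ]


# Toy pyramid (assumed values): a fine, a middle, and a coarse level.
pyramid = [
    [1.0] * 8,
    [2.0] * 4,
    [6.0] * 2,
]
out = balance_pyramid(pyramid, ref_index=1)
```

Each output level keeps its original resolution but now carries the same cross-level average, which is one way to read "balancing" the pyramid.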
