Convolutional neural networks for computer vision-based detection and recognition of dumpsters

In this paper, we propose a twofold methodology for the visual detection and recognition of different types of city dumpsters that requires minimal human labeling of the image data set. First, we carry out transfer learning with the Google Inception-v3 convolutional neural network, which is retrained on only a small labeled subset of the whole data set. This first classifier is then improved by semi-supervised learning: it is retrained for two more rounds, each one increasing the number of labeled images without human supervision. We use a data set of 27,624 labeled images of dumpsters provided by Ecoembes, a Spanish nonprofit organization that cares for the environment through recycling and the eco-design of packaging. Such a data set presents a number of challenges. As in other outdoor visual tasks, there are occluding objects such as vehicles, pedestrians and street furniture, as well as other dumpsters whenever they are placed in groups. In addition, dumpsters show different degrees of deterioration, which may affect their shape and color. Finally, 35% of the images are classified according to the capacity of the container, a feature that is hard to assess from a single snapshot. Since the data set is fully labeled, we can compare our approach against both a baseline case, which performs only the transfer learning on the minimal set of labeled images with no incremental retraining, and the best case, which uses all the labels. The experiments show that the proposed system attains an accuracy of 88%, whereas the best case attains 93%. In other words, the proposed method reaches 94% of the best performance.
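The incremental retraining described above can be illustrated with a minimal self-training sketch. It is not the authors' implementation: a nearest-centroid classifier on toy 2-D features stands in for the retrained Inception-v3 head, and the confidence threshold used to accept machine-assigned labels is an assumption, since the abstract does not state how newly labeled images are selected. All function names, class names and parameter values are hypothetical.

```python
import math
import random


def fit_centroids(features, labels):
    """Stand-in for retraining the classifier: one mean vector per class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict_with_confidence(centroids, x):
    """Return (label, confidence) from a softmax over negative distances."""
    dists = {y: math.dist(x, c) for y, c in centroids.items()}
    nearest = min(dists.values())
    weights = {y: math.exp(-(d - nearest)) for y, d in dists.items()}
    label = max(weights, key=weights.get)
    return label, weights[label] / sum(weights.values())


def self_train(labeled, unlabeled, rounds=2, threshold=0.8):
    """Train on the labeled seed, then retrain `rounds` more times.

    In each round the current model labels the unlabeled pool; predictions
    at or above `threshold` (an assumed selection rule) join the training
    set without human supervision.
    """
    features = [x for x, _ in labeled]
    labels = [y for _, y in labeled]
    pool = list(unlabeled)
    model = fit_centroids(features, labels)  # baseline: seed labels only
    for _ in range(rounds):
        remaining = []
        for x in pool:
            y, confidence = predict_with_confidence(model, x)
            if confidence >= threshold:
                features.append(x)
                labels.append(y)
            else:
                remaining.append(x)
        pool = remaining
        model = fit_centroids(features, labels)  # incremental retraining
    return model, len(labels)


# Toy data: two well-separated clusters play the role of two dumpster types.
rng = random.Random(0)
seed = [((rng.gauss(0, 0.5), rng.gauss(0, 0.5)), "class_a") for _ in range(3)]
seed += [((rng.gauss(5, 0.5), rng.gauss(5, 0.5)), "class_b") for _ in range(3)]
pool = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(50)]
pool += [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(50)]

model, n_labeled = self_train(seed, pool, rounds=2, threshold=0.8)
print("training set grew from", len(seed), "to", n_labeled, "examples")
```

In this sketch the baseline case corresponds to the model fitted before the loop, and the best case would correspond to fitting once on the fully labeled pool.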
