Simultaneous food localization and recognition

The development of automatic nutrition diaries, which would allow to keep track objectively of everything we eat, could enable a whole new world of possibilities for people concerned about their nutrition patterns. With this purpose, in this paper we propose the first method for simultaneous food localization and recognition. Our method is based on two main steps, which consist in, first, produce a food activation map on the input image (i.e. heat map of probabilities) for generating bounding boxes proposals and, second, recognize each of the food types or food-related objects present in each bounding box. We demonstrate that our proposal, compared to the most similar problem nowadays - object localization, is able to obtain high precision and reasonable recall levels with only a few bounding boxes. Furthermore, we show that it is applicable to both conventional and egocentric images.

[1]  Matthieu Guillaumin,et al.  Food-101 - Mining Discriminative Components with Random Forests , 2014, ECCV.

[2]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Keiji Yanai,et al.  Food image recognition using deep convolutional network with pre-training and fine-tuning , 2015, 2015 IEEE International Conference on Multimedia & Expo Workshops (ICMEW).

[4]  Jianguo Zhang,et al.  The PASCAL Visual Object Classes Challenge , 2006 .

[5]  Charles X. Ling,et al.  Adapting New Categories for Food Recognition with Deep Representation , 2015, 2015 IEEE International Conference on Data Mining Workshop (ICDMW).

[6]  Peter I. Corke,et al.  Modelling local deep convolutional neural network features to improve fine-grained image classification , 2015, 2015 IEEE International Conference on Image Processing (ICIP).

[7]  Sergio Guadarrama,et al.  Im2Calories: Towards an Automated Mobile Vision Food Diary , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[8]  Keiji Yanai,et al.  Real-Time Mobile Food Recognition System , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

[9]  Gregory D. Abowd,et al.  Leveraging Context to Support Automated Food Recognition in Restaurants , 2015, 2015 IEEE Winter Conference on Applications of Computer Vision.

[10]  Petia Radeva,et al.  Toward Storytelling From Visual Lifelogging: An Overview , 2015, IEEE Transactions on Human-Machine Systems.

[11]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[12]  Makoto Ogawa,et al.  Food Detection and Recognition Using Convolutional Neural Network , 2014, ACM Multimedia.

[13]  Marios Anthimopoulos,et al.  Segmentation and recognition of multi-food meal images for carbohydrate counting , 2013, 13th IEEE International Conference on BioInformatics and BioEngineering.

[14]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[16]  Bolei Zhou,et al.  Learning Deep Features for Discriminative Localization , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[18]  S. Heymsfield,et al.  Discrepancy between self-reported and actual caloric intake and exercise in obese subjects. , 1992, The New England journal of medicine.

[19]  David S. Ebert,et al.  Segmentation assisted food classification for dietary assessment , 2011, Electronic Imaging.

[20]  Luc Van Gool,et al.  The 2005 PASCAL Visual Object Classes Challenge , 2005, MLCW.

[21]  Koen E. A. van de Sande,et al.  Selective Search for Object Recognition , 2013, International Journal of Computer Vision.

[22]  Keiji Yanai,et al.  FoodCam-256: A Large-scale Real-time Mobile Food RecognitionSystem employing High-Dimensional Features and Compression of Classifier Weights , 2014, ACM Multimedia.

[23]  Keiji Yanai,et al.  Automatic Expansion of a Food Image Dataset Leveraging Existing Categories with Domain Adaptation , 2014, ECCV Workshops.

[24]  Thomas Deselaers,et al.  What is an object? , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[25]  Kiyoharu Aizawa,et al.  Highly Accurate Food/Non-Food Image Classification Based on a Deep Convolutional Neural Network , 2015, ICIAP Workshops.

[26]  Kiyoharu Aizawa,et al.  Food Balance Estimation by Using Personal Dietary Tendencies in a Multimedia Food Log , 2013, IEEE Transactions on Multimedia.