Multi-task learning of dish detection and calorie estimation

In recent years, a rise in healthy eating has led to various food management applications, many of which use image recognition to record meals automatically. However, the image recognition functions in most existing applications do not handle photos that contain multiple dishes, and they cannot estimate food calories automatically. Meanwhile, image recognition methods have advanced greatly with the advent of convolutional neural networks (CNNs), which have improved accuracy on tasks such as classification and object detection. We therefore propose CNN-based food calorie estimation for multiple-dish food photos. Our method estimates food calories and detects dishes simultaneously, through multi-task learning of both tasks in a single CNN. Performing both estimations in a single network is expected to increase speed and save memory. Currently, no dataset of multiple-dish food photos is annotated with both bounding boxes and food calories, so in this work we train the single CNN on two datasets in alternation: multiple-dish food photos annotated with bounding boxes, and single-dish food photos annotated with food calories. Our results show that the multi-task method is faster and has a smaller network size than a sequential model that performs food detection followed by food calorie estimation.
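The alternating use of two datasets described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `alternate_batches` and `active_losses`, the strict one-to-one switching between tasks, and the rule that each batch updates only the loss matching its annotations are all assumptions made for illustration.

```python
from itertools import cycle
from typing import Iterator, Sequence, Tuple


def alternate_batches(
    detection_batches: Sequence[object],
    calorie_batches: Sequence[object],
    steps: int,
) -> Iterator[Tuple[str, object]]:
    """Yield (task, batch) pairs, switching dataset on every training step.

    Each dataset is cycled independently, so the shorter one simply repeats.
    The strict 1:1 alternation is an assumption, not taken from the paper.
    """
    sources = {
        "detection": cycle(detection_batches),  # multiple-dish photos with boxes
        "calorie": cycle(calorie_batches),      # single-dish photos with calories
    }
    order = ("detection", "calorie")
    for step in range(steps):
        task = order[step % 2]
        yield task, next(sources[task])


def active_losses(task: str) -> Tuple[str, ...]:
    """Return the loss terms a batch can supervise (hypothetical rule).

    A batch only carries one kind of annotation, so only the matching
    loss term is assumed to be applied to the shared network.
    """
    return {"detection": ("detection",), "calorie": ("calorie",)}[task]


if __name__ == "__main__":
    for task, batch in alternate_batches(["d0", "d1"], ["c0"], steps=4):
        print(task, batch, active_losses(task))
```

In a real training loop, each yielded batch would be passed through the shared network and only the loss terms returned by `active_losses` would contribute to the gradient for that step.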
