Saliency-Aware Class-Agnostic Food Image Segmentation

Advances in image-based dietary assessment, in which images of the foods consumed are captured with smartphones or wearable devices and analyzed using computer vision methods to estimate energy and nutrient content, have allowed nutrition professionals and researchers to improve the accuracy of dietary assessment. Food image segmentation, which determines the regions in an image where foods are located, plays an important role in this process. Current methods are data dependent and therefore do not generalize well across different food types. To address this problem, we propose a class-agnostic food image segmentation method. Our method uses a pair of eating scene images, one captured before eating begins and one after eating is completed. Using information from both images, we segment the food by finding the salient missing objects, without any prior information about the food class. We model a paradigm of top-down saliency, in which a task guides the attention of the human visual system, to find the salient missing objects in the image pair. Our method is validated on food images collected from a dietary study and shows promising results.

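The core idea, segmenting food as the salient content that is present before eating but absent afterwards, can be illustrated with a minimal sketch. The sketch below is not the pipeline described in the paper; it assumes the image pair is spatially aligned and that a generic per-pixel salient object detector (here a hypothetical `saliency_fn` returning values in [0, 1]) is available, both of which are simplifying assumptions.

```python
import numpy as np

def segment_missing_food(before_img, after_img, saliency_fn, thresh=0.5):
    """Illustrative sketch only: take food regions to be areas that are
    salient in the 'before eating' image but no longer salient in the
    'after eating' image. Assumes the two images are spatially aligned
    and that saliency_fn returns a per-pixel map in [0, 1]."""
    sal_before = saliency_fn(before_img)   # saliency of the scene with food present
    sal_after = saliency_fn(after_img)     # saliency of the scene after eating
    # Objects that were salient before eating but are missing afterwards
    # are candidate food regions (the "salient missing objects").
    missing = np.clip(sal_before - sal_after, 0.0, 1.0)
    return (missing > thresh).astype(np.uint8)  # binary food mask
```

In practice the before and after images are unlikely to be pixel-aligned, so a real system would need registration or feature-level comparison rather than a direct per-pixel difference; the sketch only conveys the before/after comparison idea.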