Where and What Am I Eating? Image-Based Food Menu Recognition

Food has become a very important aspect of our social activities. Since the appearance of social networks and websites like Yelp, users have been uploading photos of their meals to the Internet. This phenomenon opens up a wide range of possibilities for developing food analysis and recognition models on huge amounts of real-world data. One clear application is recognizing the dish shown in a food image by matching it against the items on the restaurant's menu. Our model, based on Convolutional Neural Networks and Recurrent Neural Networks, learns a language model that generalizes to previously unseen dish names without needing to be retrained. According to the Ranking Loss metric, the model improves on the baseline by 15%.
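As a rough illustration of the menu-matching idea, the sketch below ranks menu items by similarity to an image and scores where the correct dish lands. Everything in it is an assumption made for illustration: the function names, the use of cosine similarity, the toy 3-dimensional vectors (standing in for CNN image features and RNN dish-name encodings), and the normalized rank error, which may differ from the Ranking Loss definition used in the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_menu(image_vec, menu_vecs):
    """Return dish names sorted from most to least similar to the image.

    image_vec: hypothetical image embedding (e.g. from a CNN).
    menu_vecs: dict mapping dish name -> hypothetical text embedding
               (e.g. from an RNN over the dish name).
    """
    scores = {name: cosine(image_vec, vec) for name, vec in menu_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


def normalized_rank_error(ranking, true_dish):
    """Position of the true dish in the ranking, scaled to [0, 1].

    0.0 means the true dish is ranked first, 1.0 means it is ranked last.
    This is an illustrative metric, not necessarily the paper's Ranking Loss.
    """
    if len(ranking) < 2:
        return 0.0
    return ranking.index(true_dish) / (len(ranking) - 1)


# Toy data: made-up embeddings for one image and a three-item menu.
image_vec = [0.9, 0.1, 0.0]
menu_vecs = {
    "paella": [1.0, 0.0, 0.0],
    "gazpacho": [0.0, 1.0, 0.0],
    "tiramisu": [0.0, 0.0, 1.0],
}

ranking = rank_menu(image_vec, menu_vecs)
error = normalized_rank_error(ranking, "paella")
```

Because the ranking depends only on embeddings of the dish names, a new menu can be scored without retraining, which is the property the abstract highlights for unseen dish names.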
