Food Category Recognition Using SURF and MSER Local Feature Representation

Food object recognition has gained popularity in recent years, perhaps because of its potential applications in fields such as nutrition and fitness. Recognizing food images, however, is a challenging task because foods come in many shapes and sizes. Besides exhibiting unexpected deformations and textures, food images are also captured under differing lighting conditions and camera viewpoints. From a computer vision perspective, using global image features to train a supervised classifier might be unsuitable given the complex nature of food images. Local features, on the other hand, seem the better alternative since they can capture fine detail around interest points. In this paper, two local features, namely SURF (Speeded-Up Robust Features) and MSER (Maximally Stable Extremal Regions), are investigated for food object recognition. Both features are computationally inexpensive and have been shown to be effective local descriptors for complex images. Each feature is first evaluated separately. This is followed by feature fusion to observe whether a combined representation could better represent food images. Experimental evaluations using a Support Vector Machine classifier show that feature fusion yields better recognition accuracy than either feature alone, at 86.6%.
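
The abstract does not say how the two representations are fused. The following is a minimal sketch of one common option, assuming each image has already been encoded as one fixed-length histogram per feature type (for example, bag-of-visual-words counts) and that fusion means concatenation after per-feature L2 normalization. The function names and sample values are hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(hist):
    """Scale a histogram to unit Euclidean length.

    An all-zero histogram is returned unchanged (as floats) to avoid
    dividing by zero.
    """
    norm = math.sqrt(sum(v * v for v in hist))
    if norm == 0.0:
        return [float(v) for v in hist]
    return [v / norm for v in hist]


def fuse_features(surf_hist, mser_hist):
    """Early fusion (assumed): normalize each histogram, then concatenate.

    Normalizing separately keeps the feature type with larger raw counts
    from dominating the fused vector.
    """
    return l2_normalize(surf_hist) + l2_normalize(mser_hist)


# Hypothetical visual-word counts for a single image (illustrative only).
surf_hist = [3, 0, 4]
mser_hist = [0, 5]

fused = fuse_features(surf_hist, mser_hist)
print(len(fused), fused)
```

Under these assumptions, the fused vector has one entry per SURF bin followed by one entry per MSER bin, and it would serve as the input to the Support Vector Machine classifier in place of either single-feature histogram.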
