DietCam: Multi-view regular shape food recognition with a camera phone

Abstract This paper presents an automatic multi-view food classification method for a food intake assessment system on a smartphone. Food intake assessment plays an important role in obesity management, which has a significant impact on public healthcare. Conventional dietary-record-based assessment methods are not widely adopted because of their inconvenience and heavy reliance on human interaction. This paper presents a smartphone application, named DietCam, that recognizes food intake automatically. The major difficulties in recognizing food from images arise from the uncertainty of food appearance and the deformable nature of food, especially against complex backgrounds. The proposed DietCam system uses a multi-view recognition method that separates individual food items by estimating the best perspective and recognizes them with a probabilistic method. Implemented on an iPhone 4, DietCam outperformed baseline food recognition methods, achieving an average accuracy of 84% on selected regular-shape foods.
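The abstract states that evidence from multiple viewpoints is combined with a probabilistic method but does not spell out the fusion rule. The sketch below illustrates one common choice, a naive Bayes combination of per-view class likelihoods, purely as an assumption; the function `fuse_views`, its inputs, and the independence-of-views assumption are hypothetical and not taken from the DietCam paper.

```python
import numpy as np

# Hypothetical sketch of multi-view probabilistic fusion: each camera view
# yields per-class likelihoods P(view_features | class). Assuming views are
# conditionally independent given the class, the posterior is proportional
# to the class prior times the product of the per-view likelihoods.

def fuse_views(view_likelihoods: np.ndarray, class_priors: np.ndarray) -> int:
    """view_likelihoods: shape (n_views, n_classes); class_priors: shape (n_classes,).
    Returns the index of the most probable food class."""
    # Work in log space to avoid underflow when multiplying many likelihoods.
    log_posterior = np.log(class_priors) + np.log(view_likelihoods).sum(axis=0)
    return int(np.argmax(log_posterior))

# Usage: three views scored against four candidate food classes.
likelihoods = np.array([
    [0.10, 0.60, 0.20, 0.10],   # view 1
    [0.20, 0.50, 0.20, 0.10],   # view 2
    [0.15, 0.55, 0.20, 0.10],   # view 3
])
priors = np.full(4, 0.25)       # uniform prior over candidate foods
print(fuse_views(likelihoods, priors))  # -> 1 (class favored by all views)
```

In practice the per-view likelihoods would come from whatever per-image classifier is used (e.g., matching local features against a food database), and the log-space sum simply aggregates agreement across perspectives.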
