Estimating Glycemic Impact of Cooking Recipes via Online Crowdsourcing and Machine Learning

Diets with low glycemic impact are highly recommended for diabetics and pre-diabetics because they help maintain blood glucose levels. However, laboratory analysis of dietary glycemic potency is time-consuming and expensive. In this paper, we explore a data-driven approach that uses online crowdsourcing and machine learning to estimate the glycemic impact of cooking recipes. We show that a commonly used healthiness metric may not always be effective in identifying recipes suitable for diabetics, which underscores the importance of the glycemic-impact estimation task. Our best classification model, trained on nutritional and crowdsourced data obtained from Amazon Mechanical Turk (AMT), can accurately identify recipes that are unhealthful for diabetics.
