Predicting nationwide obesity from food sales using machine learning

The obesity epidemic continues to spread worldwide, and running frequent nationwide surveys to measure the percentage of the population that is obese is costly. By contrast, country-level food sales data can be obtained inexpensively and regularly from various suppliers. This study applies a methodology for predicting obesity prevalence at the country level from national sales of a small subset of food and beverage categories. Three machine learning algorithms for nonlinear regression were implemented using purchase and obesity prevalence data from 79 countries: support vector machines, random forests, and extreme gradient boosting. The method was validated in terms of both the absolute prediction error and the proportion of countries for which obesity prevalence was predicted satisfactorily. We found that the most relevant food category for predicting obesity is baked goods and flours, followed by cheese and carbonated drinks.
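The two validation criteria named in the abstract can be illustrated with a minimal sketch. It assumes that "absolute prediction error" is summarized as a mean over countries and that a prediction counts as "satisfactory" when it falls within a fixed tolerance in percentage points; the function name `evaluate_predictions`, the 2.0-point tolerance, and the country values are all hypothetical and are not taken from the study.

```python
from statistics import mean


def evaluate_predictions(observed, predicted, tolerance=2.0):
    """Return (mean absolute error, share of countries within tolerance).

    observed, predicted: dicts mapping country -> obesity prevalence (%).
    tolerance: hypothetical cutoff, in percentage points, below which a
    prediction is counted as satisfactory (an assumption, not the paper's).
    """
    if observed.keys() != predicted.keys():
        raise ValueError("observed and predicted must cover the same countries")
    # Absolute error per country, in percentage points.
    abs_errors = {c: abs(predicted[c] - observed[c]) for c in observed}
    mae = mean(abs_errors.values())
    # Fraction of countries whose error does not exceed the tolerance.
    share_ok = sum(e <= tolerance for e in abs_errors.values()) / len(abs_errors)
    return mae, share_ok


# Made-up values for four placeholder countries (illustration only).
observed = {"A": 20.0, "B": 30.0, "C": 10.0, "D": 25.0}
predicted = {"A": 21.0, "B": 26.0, "C": 10.5, "D": 27.5}

mae, share_ok = evaluate_predictions(observed, predicted, tolerance=2.0)
print(f"MAE = {mae:.2f} points, satisfactory share = {share_ok:.0%}")
```

Reporting both numbers is useful because a low average error can hide a few badly predicted countries, while the share within tolerance shows how widely the model is usable.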
