Exploring Dietary Intake Data collected by FPQ using Unsupervised Learning

Populations in countries undergoing rapid transition are experiencing food- and nutrition-related problems. To acquire high-quality nutrition information, we need not only adequate data on food consumption but also efficient methods for extracting information from the collected data. Our aim was to develop a methodology for analyzing and reasoning about dietary intake data collected by a food propensity questionnaire (FPQ) and dependent 24-hour recalls (24HRs). We analyzed a subset of data (about 197 participants) from the SI.Menu survey carried out in 2016/17 in Slovenia. The participants completed FPQs and 24HRs. We identified four clusters. Two clusters represented participants with healthier habits, e.g., low intake of animal fats, high breakfast frequency, and high intake of fruits and vegetables. The other two clusters represented participants with less healthy habits, e.g., high intake of animal fats, low breakfast frequency, and increased BMI. The four clusters can be well separated by only four variables. This finding could lead to simplified FFQs, which could significantly decrease the participants' burden and improve participant compliance in similar studies. Having a large national data set related to nutrition should ease the process of creating sustainable policies that will ultimately benefit agriculture, human health, and the environment.
