Machine Learning Techniques for Prediction of Early Childhood Obesity.

OBJECTIVES: This paper aims to predict childhood obesity after age two, using only data collected before the second birthday by a clinical decision support system called CHICA.

METHODS: Six machine learning methods (RandomTree, RandomForest, J48, ID3, Naïve Bayes, and Bayes) were trained on CHICA data and compared; the analyses show that an accurate, sensitive model can be created.

RESULTS: Of the methods analyzed, the ID3 model trained on the CHICA dataset gave the best overall performance, with an accuracy of 85% and a sensitivity of 89%. The ID3 model also had a positive predictive value of 84% and a negative predictive value of 88%. The structure of the tree gives insight into the strongest predictors of future obesity in children. Many of the strongest predictors in the ID3 model of the CHICA dataset have been independently reported in the literature as correlated with obesity, which supports the validity of the model.

CONCLUSIONS: This study demonstrates that data from a production clinical decision support system can be used to build an accurate machine learning model that predicts obesity in children after age two.
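The four figures reported for the ID3 model (accuracy, sensitivity, positive predictive value, negative predictive value) are all derived from the same binary confusion matrix. The following is a minimal sketch of those definitions, not the authors' evaluation code. The names `ConfusionMatrix` and `summarize` are hypothetical, and the counts are made-up illustrative values, not data from the CHICA study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier, with 'obese' as the positive class."""
    tp: int  # predicted obese, actually obese
    fp: int  # predicted obese, actually not obese
    tn: int  # predicted not obese, actually not obese
    fn: int  # predicted not obese, actually obese


def summarize(cm: ConfusionMatrix) -> dict[str, float]:
    """Return the four summary metrics named in the abstract."""
    total = cm.tp + cm.fp + cm.tn + cm.fn
    return {
        # Share of all children classified correctly.
        "accuracy": (cm.tp + cm.tn) / total,
        # Share of truly obese children the model flags.
        "sensitivity": cm.tp / (cm.tp + cm.fn),
        # Share of positive predictions that are correct.
        "ppv": cm.tp / (cm.tp + cm.fp),
        # Share of negative predictions that are correct.
        "npv": cm.tn / (cm.tn + cm.fn),
    }


# Illustrative counts only (assumption): NOT taken from the paper.
example = ConfusionMatrix(tp=8, fp=2, tn=7, fn=3)
metrics = summarize(example)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and the predictive values answer different questions: sensitivity is conditioned on the true outcome, while PPV and NPV are conditioned on the prediction, so the predictive values shift with the prevalence of obesity in the population being screened.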
