Data Mining Techniques for Classification of Childhood Obesity Among Year 6 School Children

Data mining is now widely applied in many fields, including healthcare and medicine, and childhood obesity is one of the problems commonly explored with data mining techniques. This paper discusses the classification of childhood obesity among year six school children from two districts in Terengganu, Malaysia. The data were collected from two main sources: the Standard Kecergasan Fizikal Kebangsaan untuk Murid Sekolah Malaysia/National Physical Fitness Standard for Malaysian School Children (SEGAK) Assessment Program and a set of distributed questionnaires. From the collected data, 4,245 complete data sets were analyzed. Data preprocessing and feature selection were applied to these data sets. Four classification techniques, namely Bayesian Network, Decision Tree, Neural Networks and Support Vector Machine (SVM), were then applied to the data sets and compared. This paper presents an evaluation of several feature selection methods based on the different classifiers.
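As a rough illustration of what a feature selection step can look like before classification, the sketch below ranks categorical features by information gain. The abstract does not name the feature selection methods that were evaluated, so information gain is an assumption, used here only as one common filter-style method. The records and the feature names (`activity`, `fast_food`, `gender`) are made up for the example and are not data from the study.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Reduction in class entropy obtained by splitting on one categorical feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Weighted average entropy of the class labels within each feature value.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_features(rows, labels):
    """Return (feature, gain) pairs sorted from most to least informative."""
    gains = [(f, information_gain(rows, labels, f)) for f in rows[0].keys()]
    return sorted(gains, key=lambda pair: pair[1], reverse=True)


# Toy, made-up records (NOT data from the study); feature names are hypothetical.
rows = [
    {"activity": "low", "fast_food": "often", "gender": "m"},
    {"activity": "low", "fast_food": "often", "gender": "f"},
    {"activity": "low", "fast_food": "rarely", "gender": "f"},
    {"activity": "high", "fast_food": "often", "gender": "m"},
    {"activity": "high", "fast_food": "rarely", "gender": "f"},
    {"activity": "high", "fast_food": "rarely", "gender": "f"},
]
labels = ["obese", "obese", "obese", "normal", "normal", "normal"]

ranking = rank_features(rows, labels)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

In this toy data `activity` separates the two classes perfectly, so it ranks first, while `gender` carries no information about the class and ranks last. In a comparison of the kind the abstract describes, the top-ranked features would then be passed to each classifier and the resulting accuracies compared.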
