Effective large for gestational age prediction using machine learning techniques with monitoring biochemical indicators

A newborn whose birth weight is above the 90th percentile for the same gestational age is termed large for gestational age. Large for gestational age infants suffer serious complications during and after the antepartum period when the condition is not identified early. Earlier recognition of large for gestational age infants could slow progression and prevent further complications. In medical science, prevention and mitigation of disease require examination of biochemical indicators. Machine learning has evolved into a tool that could predict large for gestational age infants from their most deterministic characteristics. This study aims to identify the most deterministic biochemical indicators for large for gestational age prediction with minimal computational overhead. To the best of our knowledge, this is the first study to identify the most deterministic risk factors associated with large for gestational age and to develop a large for gestational age prediction model using machine learning techniques. To develop an efficient prediction model, we conducted three groups of experiments that considered basic machine learning methods, feature selection, and imbalanced data, respectively. Support vector machine, logistic regression, naive Bayes, and random forest classifiers were trained using tenfold cross-validation on a large for gestational age dataset. We selected precision and area under the curve as performance evaluation metrics. Information gain, an entropy-based feature selection method, was adopted to rank features. In the last group of experiments, we introduced an ensemble data imbalance technique. In each group of experiments, the support vector machine performed best among the classifiers, producing the highest prediction precision score of 85%. All of the classifiers performed best with the subset of the thirty top-ranked features, which supports the applied method for recognizing the most deterministic risk factors associated with large for gestational age.
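As a rough illustration of the entropy-based ranking step mentioned above, the following is a minimal sketch of information gain computed for discrete features, using only the Python standard library. It is not the authors' implementation: the function names, the feature names (`glucose_band`, `smoker`), and the four toy records are hypothetical, and continuous biochemical indicators would first need to be discretized, a step this sketch omits.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """H(labels) - H(labels | feature) for one discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels within each feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels, names):
    """Return (name, gain) pairs sorted from most to least informative."""
    gains = [
        (name, information_gain([row[i] for row in rows], labels))
        for i, name in enumerate(names)
    ]
    return sorted(gains, key=lambda pair: pair[1], reverse=True)


# Toy, made-up records (hypothetical features); label 1 = large for gestational age.
rows = [("high", "no"), ("high", "yes"), ("low", "no"), ("low", "yes")]
labels = [1, 1, 0, 0]
ranking = rank_features(rows, labels, ["glucose_band", "smoker"])
```

In this toy data the first feature separates the two classes perfectly and the second carries no information, so it ranks last. Keeping the top-ranked subset (thirty features in the study) would then amount to slicing the start of such a ranking before training the classifiers.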
