Gradient boosting decision tree becomes more reliable than logistic regression in predicting probability for diabetes with big data

[1]  B. Duncan,et al.  IDF diabetes Atlas: Global, regional and country-level diabetes prevalence estimates for 2021 and projections for 2045 , 2021, Diabetes Research and Clinical Practice.

[2]  Amitai Armon,et al.  Tabular Data: Deep Learning is Not All You Need , 2021, Inf. Fusion.

[3]  D. Khalili,et al.  Prediction Models for Type 2 Diabetes Risk in the General Population: A Systematic Review of Observational Studies , 2021, International journal of endocrinology and metabolism.

[4]  Kushan De Silva,et al.  Use and performance of machine learning models for type 2 diabetes prediction in community settings: A systematic review and meta-analysis , 2020, Int. J. Medical Informatics.

[5]  A. Sheikh,et al.  Early detection of type 2 diabetes mellitus using machine learning-based prediction models , 2020, Scientific Reports.

[6]  A. Goto,et al.  Japanese Clinical Practice Guideline for Diabetes 2019 , 2020, Diabetology International.

[7]  Richard A. Bauder,et al.  Investigating class rarity in big data , 2020, Journal of Big Data.

[8]  T. Wong,et al.  Logistic regression was as good as machine learning for predicting major chronic diseases. , 2020, Journal of clinical epidemiology.

[9]  Maarten van Smeden,et al.  Calibration: the Achilles heel of predictive analytics , 2019, BMC Medicine.

[10]  Takuya Akiba,et al.  Optuna: A Next-generation Hyperparameter Optimization Framework , 2019, KDD.

[11]  S. Kaushik,et al.  Big data in healthcare: management, analysis and future prospects , 2019, Journal of Big Data.

[12]  K. Ngiam,et al.  Big data and machine learning algorithms for health-care delivery. , 2019, The Lancet. Oncology.

[13]  Jie Ma,et al.  A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. , 2019, Journal of clinical epidemiology.

[14]  I. Kohane,et al.  Big Data and Machine Learning in Health Care. , 2018, JAMA.

[15]  Tie-Yan Liu,et al.  LightGBM: A Highly Efficient Gradient Boosting Decision Tree , 2017, NIPS.

[16]  M. Delgado-Rodríguez,et al.  Systematic review and meta-analysis. , 2017, Medicina intensiva.

[17]  Scott Lundberg,et al.  A Unified Approach to Interpreting Model Predictions , 2017, NIPS.

[18]  I. Vlahavas,et al.  Machine Learning and Data Mining Methods in Diabetes Research , 2017, Computational and structural biotechnology journal.

[19]  Z. Obermeyer,et al.  Predicting the Future - Big Data, Machine Learning, and Clinical Medicine. , 2016, The New England journal of medicine.

[20]  Tianqi Chen,et al.  XGBoost: A Scalable Tree Boosting System , 2016, KDD.

[21]  O. Hejlesen,et al.  Toward Big Data Analytics , 2016, Journal of diabetes science and technology.

[22]  Gary S Collins,et al.  Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and Elaboration , 2015, Annals of Internal Medicine.

[23]  Ewout W Steyerberg,et al.  Modern modelling techniques are data hungry: a simulation study for predicting dichotomous endpoints , 2014, BMC Medical Research Methodology.

[24]  B. Vandermeer,et al.  Lifestyle Interventions for Patients With and at Risk for Type 2 Diabetes , 2013, Annals of Internal Medicine.

[25]  Ling Wang,et al.  Evaluating the risk of type 2 diabetes mellitus using artificial neural network: an effective classification approach. , 2013, Diabetes research and clinical practice.

[26]  Martha Sajatovic,et al.  Clinical Prediction Models , 2013.

[27]  Stanley Lemeshow,et al.  Standardizing the power of the Hosmer–Lemeshow goodness of fit test in large data sets , 2013, Statistics in medicine.

[28]  E. Mohammadi,et al.  Barriers and facilitators related to the implementation of a physiological track and trigger system: A systematic review of the qualitative evidence , 2017, International journal for quality in health care : journal of the International Society for Quality in Health Care.

[29]  Trisha Greenhalgh,et al.  Risk models and scores for type 2 diabetes: systematic review , 2011, BMJ : British Medical Journal.

[30]  G. Collins,et al.  Developing risk prediction models for type 2 diabetes: a systematic review of methodology and reporting , 2011, BMC medicine.

[31]  Simon J. Griffin,et al.  Risk Assessment Tools for Identifying Individuals at Risk of Developing Type 2 Diabetes , 2011, Epidemiologic reviews.

[32]  A. Thrift,et al.  Systematic Review of Observational Studies , 2010, Neuroepidemiology.

[33]  M. Fowler Microvascular and Macrovascular Complications of Diabetes , 2008, Clinical Diabetes.

[34]  W. Briggs Statistical Methods in the Atmospheric Sciences , 2007.

[35]  Rich Caruana,et al.  Predicting good probabilities with supervised learning , 2005, ICML.

[36]  J. Friedman Greedy function approximation: A gradient boosting machine. , 2001.

[37]  V Kishore Ayyadevara,et al.  Gradient Boosting Machine , 2018.

[38]  Tie-Yan Liu,et al.  A Highly Efficient Gradient Boosting Decision Tree , 2017, NIPS 2017.

[39]  Douglas G. Altman,et al.  Explanation and Elaboration , 2022.