Identifying people at risk of developing type 2 diabetes: A comparison of predictive analytics techniques and predictor variables

[1]  Habibollah Esmaily,et al.  A Comparison between Decision Tree and Random Forest in Determining the Risk Factors Associated with Type 2 Diabetes , 2018, Journal of research in health sciences.

[2]  Riccardo Bellazzi,et al.  Machine Learning Methods to Predict Diabetes Complications , 2018, Journal of diabetes science and technology.

[3]  G. Guyatt,et al.  Discrimination and Calibration of Clinical Prediction Models: Users’ Guides to the Medical Literature , 2017, JAMA.

[4]  Jay Daniel,et al.  Data Completeness in Healthcare: A Literature Survey , 2017, Pac. Asia J. Assoc. Inf. Syst.

[5]  Manal Alghamdi,et al.  Predicting diabetes mellitus using SMOTE and ensemble machine learning approach: The Henry Ford ExercIse Testing (FIT) project , 2017, PloS one.

[6]  Sayadi Mehrab,et al.  Simple Prediction of Type 2 Diabetes Mellitus via Decision Tree Modeling , 2017 .

[7]  Sabri Boughorbel,et al.  Optimal classifier for imbalanced data using Matthews Correlation Coefficient metric , 2017, PloS one.

[8]  Tarek Sadraoui,et al.  A Multilayer Perceptron Artificial Neural Networks Based a Preprocessing and Hybrid Optimization Task for Data Mining and Classification , 2017 .

[9]  Chumphol Bunkhumpornpat,et al.  DBMUTE: density-based majority under-sampling technique , 2017, Knowledge and Information Systems.

[10]  Ali Dag,et al.  Predicting heart transplantation outcomes through data analytics , 2017, Decis. Support Syst.

[11]  I. Vlahavas,et al.  Machine Learning and Data Mining Methods in Diabetes Research , 2017, Computational and structural biotechnology journal.

[12]  T. Greenhalgh,et al.  Efficacy and effectiveness of screen and treat policies in prevention of type 2 diabetes: systematic review and meta-analysis of screening tests and interventions , 2017, British Medical Journal.

[13]  Nicholas J Wareham,et al.  A Systematic Review of Biomarkers and Risk of Incident Type 2 Diabetes: An Overview of Epidemiological, Prediction and Aetiological Research Literature , 2016, PloS one.

[14]  Roy Taylor,et al.  Type 2 Diabetes: The Pathologic Basis of Reversible β-Cell Dysfunction , 2016, Diabetes Care.

[15]  Anne M. P. Canuto,et al.  Fusion Approaches of Feature Selection Algorithms for Classification Problems , 2016, 2016 5th Brazilian Conference on Intelligent Systems (BRACIS).

[16]  Ya Zhang,et al.  A Machine Learning-based Framework to Identify Type 2 Diabetes through Electronic Health Records , 2016, bioRxiv.

[17]  Xiaoqing Zhou,et al.  An under-sampling imbalanced learning of data gravitation based classification , 2016, 2016 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD).

[18]  Elham Heidari,et al.  Accurate prediction of nanofluid viscosity using a multilayer perceptron artificial neural network (MLP-ANN) , 2016 .

[19]  Ali Dag,et al.  A probabilistic data-driven framework for scoring the preoperative recipient-donor heart transplant survival , 2016, Decis. Support Syst.

[20]  Holly E. Gurgle,et al.  Diabetes Mellitus: Screening and Diagnosis. , 2016, American family physician.

[21]  O. Hejlesen,et al.  Toward Big Data Analytics , 2016, Journal of diabetes science and technology.

[22]  Anthony Man-Cho So,et al.  A unified approach to error bounds for structured convex optimization problems , 2015, Mathematical Programming.

[23]  Claudio Cobelli,et al.  A Bayesian Network for Probabilistic Reasoning and Imputation of Missing Risk Factors in Type 2 Diabetes , 2015, AIME.

[24]  Duncan Fyfe Gillies,et al.  A Review of Feature Selection and Feature Extraction Methods Applied on Microarray Data , 2015, Adv. Bioinformatics.

[25]  R. Bellazzi,et al.  Big Data Technologies , 2015, Journal of diabetes science and technology.

[26]  Marc Suhrcke,et al.  The Economic Costs of Type 2 Diabetes: A Global Systematic Review , 2015, PharmacoEconomics.

[27]  Paul H. Lee,et al.  Resampling Methods Improve the Predictive Power of Modeling in Class-Imbalanced Datasets , 2014, International journal of environmental research and public health.

[28]  M. Touvier,et al.  Relationships between adipokines, biomarkers of endothelial function and inflammation and risk of type 2 diabetes. , 2014, Diabetes research and clinical practice.

[29]  Taghi M. Khoshgoftaar,et al.  Comparison of Data Sampling Approaches for Imbalanced Bioinformatics Data , 2014, FLAIRS.

[30]  Alex Alves Freitas,et al.  Comprehensible classification models: a position paper , 2014, SKDD.

[31]  E. Rimm,et al.  Genetically Elevated Fetuin-A Levels, Fasting Glucose Levels, and Risk of Type 2 Diabetes , 2013, Diabetes Care.

[32]  V. Lagani,et al.  A systematic review of predictive risk models for diabetes complications based on large scale clinical studies. , 2013, Journal of diabetes and its complications.

[33]  Aline Castello Branco Mancuso,et al.  Review of combining forecasts approaches , 2013 .

[34]  Rushi Longadge,et al.  Class Imbalance Problem in Data Mining Review , 2013, ArXiv.

[35]  Renato Tinós,et al.  Using Machine Learning Classifiers to Assist Healthcare-Related Decisions: Classification of Electronic Patient Records , 2012, Journal of Medical Systems.

[36]  J. Dixon,et al.  Predicting the Glycemic Response to Gastric Bypass Surgery in Patients With Type 2 Diabetes , 2012, Diabetes Care.

[37]  P. Zimmet,et al.  The worldwide epidemiology of type 2 diabetes mellitus—present and future perspectives , 2012, Nature Reviews Endocrinology.

[38]  Trisha Greenhalgh,et al.  Risk models and scores for type 2 diabetes: systematic review , 2011, BMJ : British Medical Journal.

[39]  G. Collins,et al.  Developing risk prediction models for type 2 diabetes: a systematic review of methodology and reporting , 2011, BMC medicine.

[40]  Mohammad Khalilia,et al.  Predicting disease risks from highly imbalanced data using random forest , 2011, BMC Medical Informatics Decis. Mak.

[41]  N. Obuchowski,et al.  Assessing the Performance of Prediction Models: A Framework for Traditional and Novel Measures , 2010, Epidemiology.

[42]  Dursun Delen,et al.  Predicting the graft survival for heart-lung transplantation patients: An integrated data mining methodology , 2009, Int. J. Medical Informatics.

[43]  Matthew A. North,et al.  A Method for Implementing a Statistically Significant Number of Data Classes in the Jenks Algorithm , 2009, 2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery.

[44]  Tianxi Cai,et al.  Joint Effects of Common Genetic Variants on the Risk for Type 2 Diabetes in U.S. Men and Women of European Ancestry , 2009, Annals of Internal Medicine.

[45]  E. Bass,et al.  Risk factors for type 2 diabetes among women with gestational diabetes: a systematic review. , 2009, The American journal of medicine.

[46]  José Hernández-Orallo,et al.  An experimental comparison of performance measures for classification , 2009, Pattern Recognit. Lett.

[47]  D. Leroith,et al.  Obesity and type 2 diabetes are associated with an increased risk of developing cancer and a worse prognosis; epidemiological and mechanistic evidence. , 2008, Experimental and clinical endocrinology & diabetes : official journal, German Society of Endocrinology [and] German Diabetes Association.

[48]  Jean Tichet,et al.  Predicting Diabetes: Clinical, Biological, and Genetic Approaches , 2008, Diabetes Care.

[49]  David M. Eddy,et al.  Diabetes Risk Calculator , 2008, Diabetes Care.

[50]  Léon Bottou,et al.  The Tradeoffs of Large Scale Learning , 2007, NIPS.

[51]  Jian Huang,et al.  Supervised group Lasso with applications to microarray data , 2007, BMC Bioinformatics.

[52]  William Stafford Noble,et al.  Support vector machine , 2013 .

[53]  Andrew Kusiak,et al.  Predicting survival time for kidney dialysis patients: a data mining approach , 2005, Comput. Biol. Medicine.

[54]  J. Avorn,et al.  A review of uses of health care utilization databases for epidemiologic research on therapeutics. , 2005, Journal of clinical epidemiology.

[55]  Ken Williams,et al.  Identification of individuals with insulin resistance using routine clinical measurements. , 2005, Diabetes.

[56]  Daniel T. Larose,et al.  k‐Nearest Neighbor Algorithm , 2005 .

[57]  Rich Caruana,et al.  Data mining in metric space: an empirical analysis of supervised learning performance criteria , 2004, ROCAI.

[58]  Stefano Tarantola,et al.  Sensitivity Analysis in Practice: A Guide to Assessing Scientific Models , 2004 .

[59]  David M Eddy,et al.  Archimedes: a trial-validated model of diabetes. , 2003, Diabetes care.

[60]  Jaakko Tuomilehto,et al.  The diabetes risk score: a practical tool to predict type 2 diabetes risk. , 2003, Diabetes care.

[61]  S J Pöppl,et al.  Predicting Type 2 diabetes using an electronic nose-based artificial neural network analysis. , 2002, Diabetes, nutrition & metabolism.

[62]  H. White,et al.  Logistic regression in the medical literature: standards for use and reporting, with particular attention to one medical domain. , 2001, Journal of clinical epidemiology.

[63]  Igor Kononenko,et al.  Machine learning for medical diagnosis: history, state of the art and perspective , 2001, Artif. Intell. Medicine.

[64]  W. Pan.  Akaike's Information Criterion in Generalized Estimating Equations , 2001, Biometrics.

[65]  G. De’ath,et al.  Classification and Regression Trees: A Powerful Yet Simple Technique for Ecological Data Analysis , 2000 .

[66]  Michael J. Pazzani,et al.  Reducing Misclassification Costs , 1994, ICML.

[67]  B. Matthews Comparison of the predicted and observed secondary structure of T4 phage lysozyme. , 1975, Biochimica et biophysica acta.

[68]  Claude E. Shannon,et al.  The mathematical theory of communication , 1950 .

[69]  C. E. Shannon,et al.  A mathematical theory of communication , 1948, MOCO.

[70]  John P. A. Ioannidis,et al.  Opportunities and challenges in developing risk prediction models with electronic health records data: a systematic review , 2017, J. Am. Medical Informatics Assoc.

[71]  R. Thawonmas,et al.  Borderline Oversampling in Feature Space for Learning Algorithms in Imbalanced Data Environments , 2016 .

[72]  Varun Jaiswal,et al.  A first attempt to develop a diabetes prediction method based on different global datasets , 2016, 2016 Fourth International Conference on Parallel, Distributed and Grid Computing (PDGC).

[73]  Shan Suthaharan,et al.  Machine Learning Models and Algorithms for Big Data Classification , 2016 .

[74]  M. Mostafizur Rahman,et al.  Addressing the Class Imbalance Problem in Medical Datasets , 2013 .

[75]  Simon Fong,et al.  An Application of Oversampling, Undersampling, Bagging and Boosting in Handling Imbalanced Datasets , 2013, DaEng.

[76]  J. Havel,et al.  Artificial neural networks in medical diagnosis , 2013 .

[77]  Pradeep Kumar Ray,et al.  Towards an ontology for data quality in integrated chronic disease management: A realist review of the literature , 2013, Int. J. Medical Informatics.

[78]  Vaishali Ganganwar,et al.  An overview of classification algorithms for imbalanced datasets , 2012 .

[79]  Wei-Yin Loh,et al.  Classification and regression trees , 2011, WIREs Data Mining Knowl. Discov.

[80]  Taghi M. Khoshgoftaar,et al.  RUSBoost: A Hybrid Approach to Alleviating Class Imbalance , 2010, IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans.

[81]  Léon Bottou,et al.  Large-Scale Machine Learning with Stochastic Gradient Descent , 2010, COMPSTAT.

[82]  Akuh Adaji,et al.  The use of information technology to enhance diabetes management in primary care: a literature review. , 2008, Informatics in primary care.

[83]  Sarah Jane Delany.  k-Nearest Neighbour Classifiers , 2007 .

[84]  Dimitris Kanellopoulos,et al.  Handling imbalanced datasets: A review , 2006 .

[85]  Nitesh V. Chawla,et al.  Data Mining for Imbalanced Datasets: An Overview , 2005, The Data Mining and Knowledge Discovery Handbook.

[86]  Vicenç Torra,et al.  Trends in Information fusion in Data Mining , 2003 .

[87]  Nitesh V. Chawla,et al.  SMOTE: Synthetic Minority Over-sampling Technique , 2002, J. Artif. Intell. Res.

[88]  Isabelle Guyon,et al.  A Scaling Law for the Validation-Set Training-Set Size Ratio , 1997 .