Synthetic sampling from small datasets: A modified mega-trend diffusion approach using k-nearest neighbors
