Non-imaging Medical Data Synthesis for Trustworthy AI: A Comprehensive Survey
M. Yong | Lichao Wang | Xiaodan Xing | Guang Yang | Simon Walsh | Huanjun Wu | Iain Stenson | Javier Del Ser
[1] Naurin Farooq Khan,et al. Synthetic data generation: State of the art in health care domain , 2023, Comput. Sci. Rev..
[2] E. Lamine,et al. MedWGAN based synthetic dataset generation for Uveitis pathology , 2023, Intell. Syst. Appl..
[3] Debbie Rankin,et al. Synthetic Tabular Data Evaluation in the Health Domain Covering Resemblance, Utility, and Privacy Dimensions , 2022, Methods of Information in Medicine.
[4] Scott R. Smith,et al. Synthetic data in health care: A narrative review , 2023, PLOS digital health.
[5] Bradley A. Malin,et al. A Multifaceted benchmarking of synthetic electronic health record generation models , 2022, Nature communications.
[6] Li Fei-Fei,et al. Advances, challenges and opportunities in creating data for trustworthy AI , 2022, Nature Machine Intelligence.
[7] L. Szpruch,et al. Synthetic Data - what, why and how? , 2022, ArXiv.
[8] Debbie Rankin,et al. Synthetic data generation for tabular health records: A systematic review , 2022, Neurocomputing.
[9] M. Naik,et al. From Missing Data Imputation to Data Generation , 2022, J. Comput. Sci..
[10] Sophie J. Nightingale,et al. AI-synthesized faces are indistinguishable from real faces and more trustworthy , 2022, Proceedings of the National Academy of Sciences.
[11] Shijian Lu,et al. Multimodal Image Synthesis and Editing: A Survey , 2022, IEEE transactions on pattern analysis and machine intelligence.
[12] S. Vimal,et al. Conquering insufficient/imbalanced data learning for the Internet of Medical Things , 2022, Neural Computing and Applications.
[13] Wonkeun Jo,et al. OBGAN: Minority oversampling near borderline with generative adversarial networks , 2022, Expert Syst. Appl..
[14] Kaley J. Rittichier,et al. Trustworthy Artificial Intelligence: A Review , 2022, ACM Comput. Surv..
[15] Sean L. Hill,et al. Conceptualising fairness: three pillars for medical algorithms and health equity , 2022, BMJ Health & Care Informatics.
[16] Huaxin Xiao,et al. SA-CGAN: An oversampling method based on single attribute guided conditional GAN for multi-class imbalanced learning , 2021, Neurocomputing.
[17] Arif Subhan. Centers for Medicare & Medicaid Services , 2021, Journal of Clinical Engineering.
[18] Hao Luo,et al. Oversampling by a Constraint-Based Causal Network in Medical Imbalanced Data Classification , 2021, 2021 IEEE International Conference on Multimedia and Expo (ICME).
[19] Pengjiang Qian,et al. GAN-Based Medical Images Synthesis , 2021, International Journal of Health Systems and Translational Medicine.
[20] Prafulla Dhariwal,et al. Diffusion Models Beat GANs on Image Synthesis , 2021, NeurIPS.
[21] Min Kyung Lee,et al. Who Is Included in Human Perceptions of AI?: Trust and Perceived Fairness around Healthcare AI and Cultural Mistrust , 2021, CHI.
[22] Hongsheng Hu,et al. Membership Inference Attacks on Machine Learning: A Survey , 2021, ACM Comput. Surv..
[23] G. Clermont,et al. Sharing ICU Patient Data Responsibly Under the Society of Critical Care Medicine/European Society of Intensive Care Medicine Joint Data Science Collaboration: The Amsterdam University Medical Centers Database (AmsterdamUMCdb) Example* , 2021, Critical care medicine.
[24] A. Tucker,et al. Generating and evaluating cross‐sectional synthetic electronic healthcare data: Preserving data utility and patient privacy , 2021, Comput. Intell..
[25] Ninghui Li,et al. PrivSyn: Differentially Private Data Synthesis , 2020, USENIX Security Symposium.
[26] Dhamanpreet Kaur,et al. Application of Bayesian networks to generate synthetic health data , 2020, J. Am. Medical Informatics Assoc..
[27] Jimeng Sun,et al. EVA: Generating Longitudinal Electronic Health Records Using Conditional Variational Autoencoders , 2020, MLHC.
[28] Chao Yan,et al. SynTEG: a framework for temporal structured electronic health data simulation , 2020, J. Am. Medical Informatics Assoc..
[29] C. Troncoso,et al. Synthetic Data - Anonymisation Groundhog Day , 2020, USENIX Security Symposium.
[30] Puja Myles,et al. Generating high-fidelity synthetic patient data for assessing machine learning healthcare software , 2020, npj Digital Medicine.
[31] Anna V. Kaluzhnaya,et al. Bayesian Networks-based personal data synthesis , 2020, GOODTECHS.
[32] Stefan Lessmann,et al. Conditional Wasserstein GAN-based Oversampling of Tabular Data for Imbalanced Learning , 2020, Expert Syst. Appl..
[33] Stanley Kok,et al. Generating Privacy-Preserving Synthetic Tabular Data Using Oblivious Variational Autoencoders , 2020 .
[34] Deevakar Rogith,et al. Generating sequential electronic health records using dual adversarial autoencoder , 2020, J. Am. Medical Informatics Assoc..
[35] Linda Coyle,et al. Generation and evaluation of synthetic patient data , 2020, BMC Medical Research Methodology.
[36] Isabelle Guyon,et al. Generation and evaluation of privacy preserving synthetic health data , 2020, Neurocomputing.
[37] Ziqi Zhang,et al. Generating Electronic Health Records with Multiple Data Types and Constraints , 2020, AMIA.
[38] Jinsung Yoon,et al. Anonymization Through Data Synthesis Using Generative Adversarial Networks (ADS-GAN) , 2020, IEEE Journal of Biomedical and Health Informatics.
[39] Michael I. Jordan,et al. Decision-Making with Auto-Encoding Variational Bayes , 2020, NeurIPS.
[40] Jeannette M. Wing. Trustworthy AI , 2020, Commun. ACM.
[41] Eenjun Hwang,et al. BCGAN-Based Over-Sampling Scheme for Imbalanced Data , 2020, 2020 IEEE International Conference on Big Data and Smart Computing (BigComp).
[42] E. Fox,et al. CorGAN: Correlation-Capturing Convolutional Generative Adversarial Networks for Generating Synthetic Healthcare Records , 2020, FLAIRS.
[43] Xintao Wu,et al. FairGAN+: Achieving Fair Data Generation and Classification through Generative Adversarial Nets , 2019, 2019 IEEE International Conference on Big Data (Big Data).
[44] Sebastian Bosse,et al. EEGSourceSim: A framework for realistic simulation of EEG scalp data using MRI-based forward models and biologically plausible signals and noise , 2019, Journal of Neuroscience Methods.
[45] M. Kukar,et al. Diagnosing brain tumours by routine blood tests using machine learning , 2019, Scientific Reports.
[46] Chao Yan,et al. Ensuring electronic medical record simulation through better training, modeling, and evaluation , 2019, J. Am. Medical Informatics Assoc..
[47] Tomas E. Ward,et al. Synthesis of Realistic ECG using Generative Adversarial Networks , 2019, ArXiv.
[48] Mario Fritz,et al. GAN-Leaks: A Taxonomy of Membership Inference Attacks against Generative Models , 2019, CCS.
[49] Lei Xu,et al. Modeling Tabular data using Conditional GAN , 2019, NeurIPS.
[50] Lu Wang,et al. Continuous Patient-Centric Sequence Generation via Sequentially Coupled Adversarial Learning , 2019, DASFAA.
[51] Kemal Polat,et al. A Hybrid Approach to Parkinson Disease Classification Using Speech Signal: The Combination of SMOTE and Random Forests , 2019, 2019 Scientific Meeting on Electrical-Electronics & Biomedical Engineering and Computer Science (EBBT).
[52] Joyita Dutta,et al. Graph Convolutional Neural Networks For Alzheimer’s Disease Classification , 2019, 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019).
[53] Radu State,et al. Improving Missing Data Imputation with Deep Generative Models , 2019, ArXiv.
[54] Paris Perdikaris,et al. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations , 2019, J. Comput. Phys..
[55] Yoshua Bengio,et al. Maximum Entropy Generators for Energy-Based Models , 2019, ArXiv.
[56] Bo Li,et al. Differentially Private Data Generative Models , 2018, ArXiv.
[57] Lei Xu,et al. Synthesizing Tabular Data using Generative Adversarial Networks , 2018, ArXiv.
[58] Katherine E Henson,et al. Risk of Suicide After Cancer Diagnosis in England , 2018, JAMA psychiatry.
[59] Mihaela van der Schaar,et al. PATE-GAN: Generating Synthetic Data with Differential Privacy Guarantees , 2018, ICLR.
[60] Alistair E. W. Johnson,et al. The eICU Collaborative Research Database, a freely available multi-center database for critical care research , 2018, Scientific Data.
[61] Mihaela van der Schaar,et al. GAIN: Missing Data Imputation using Generative Adversarial Nets , 2018, ICML.
[62] Sushil Jajodia,et al. Data Synthesis based on Generative Adversarial Networks , 2018, Proc. VLDB Endow..
[63] Lu Zhang,et al. FairGAN: Fairness-aware Generative Adversarial Networks , 2018, 2018 IEEE International Conference on Big Data (Big Data).
[64] Gözde B. Ünal,et al. Patch-Based Image Inpainting with Generative Adversarial Networks , 2018, ArXiv.
[65] Fei Wang,et al. Differentially Private Generative Adversarial Network , 2018, ArXiv.
[66] Pengtao Xie,et al. On the Automatic Generation of Medical Imaging Reports , 2017, ACL.
[67] Jacob Schreiber,et al. Pomegranate: fast and flexible probabilistic modeling in python , 2017, J. Mach. Learn. Res..
[68] Cecilia M. Procopiuc,et al. PrivBayes , 2017 .
[69] Yu Cheng,et al. Boosting Deep Learning Risk Prediction with Generative Adversarial Networks for Electronic Health Records , 2017, 2017 IEEE International Conference on Data Mining (ICDM).
[70] Mark Kramer,et al. Synthea: An approach, method, and software mechanism for generating synthetic patients and the synthetic electronic health care record , 2017, J. Am. Medical Informatics Assoc..
[71] Paul Voigt,et al. The EU General Data Protection Regulation (GDPR): A Practical Guide , 2017 .
[72] Matjaž Kukar,et al. An application of machine learning to haematological diagnosis , 2017, Scientific Reports.
[73] Léon Bottou,et al. Wasserstein Generative Adversarial Networks , 2017, ICML.
[74] Zhiwei Steven Wu,et al. Privacy-Preserving Generative Deep Neural Networks Support Clinical Data Sharing , 2017, bioRxiv.
[75] Bill Howe,et al. DataSynthesizer: Privacy-Preserving Synthetic Datasets , 2017, SSDBM.
[76] Haipeng Shen,et al. Artificial intelligence in healthcare: past, present and future , 2017, Stroke and Vascular Neurology.
[77] Jaime S. Cardoso,et al. Transfer Learning with Partial Observability Applied to Cervical Cancer Screening , 2017, IbPRIA.
[78] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[79] Gunnar Rätsch,et al. Real-valued (Medical) Time Series Generation with Recurrent Conditional GANs , 2017, ArXiv.
[80] Charles A. Sutton,et al. VEEGAN: Reducing Mode Collapse in GANs using Implicit Variational Learning , 2017, NIPS.
[81] Jimeng Sun,et al. Generating Multi-label Discrete Patient Records using Generative Adversarial Networks , 2017, MLHC.
[82] Graham Neubig,et al. Neural Machine Translation and Sequence-to-sequence Models: A Tutorial , 2017, ArXiv.
[83] Nagiza F. Samatova,et al. Theory-Guided Data Science: A New Paradigm for Scientific Discovery from Data , 2016, IEEE Transactions on Knowledge and Data Engineering.
[84] Yu Cheng,et al. Generative Adversarial Networks as Variational Training of Energy Based Models , 2016, ArXiv.
[85] Jonathon Shlens,et al. Conditional Image Synthesis with Auxiliary Classifier GANs , 2016, ICML.
[86] Vitaly Shmatikov,et al. Membership Inference Attacks Against Machine Learning Models , 2016, 2017 IEEE Symposium on Security and Privacy (SP).
[87] Kudakwashe Dube,et al. Using the CareMap with Health Incidents Statistics for Generating the Realistic Synthetic Electronic Healthcare Record , 2016, 2016 IEEE International Conference on Healthcare Informatics (ICHI).
[88] Kalyan Veeramachaneni,et al. The Synthetic Data Vault , 2016, 2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA).
[89] Fernando Nogueira,et al. Imbalanced-learn: A Python Toolbox to Tackle the Curse of Imbalanced Datasets in Machine Learning , 2016, J. Mach. Learn. Res..
[90] Christina Thorpe,et al. COCOA: A Synthetic Data Generator for Testing Anonymization Techniques , 2016, PSD.
[91] David Riaño,et al. Simulation-Based Episodes of Care Data Synthetization for Chronic Disease Patients , 2016, KR4HC/ProHealth@HEC.
[92] Deok-Soo Kim,et al. Copula-Based Approach to Synthetic Population Generation , 2016, PloS one.
[93] Uri Kartoun,et al. A Methodology to Generate Virtual Patient Repositories , 2016, ArXiv.
[94] Ian Goodfellow,et al. Deep Learning with Differential Privacy , 2016, CCS.
[95] J. Schulman,et al. InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets , 2016, NIPS.
[96] Peter Szolovits,et al. MIMIC-III, a freely accessible critical care database , 2016, Scientific Data.
[97] Alexander Erath,et al. A Bayesian network approach for population synthesis , 2015 .
[98] Lynda L. McGhie,et al. Health Insurance Portability and Accountability Act (HIPAA) , 2011, Encyclopedia of Information Assurance.
[99] Kyungmin Su,et al. The PREP pipeline: standardized preprocessing for large-scale EEG analysis , 2015, Front. Neuroinform..
[100] Carlos Eduardo Scheidegger,et al. Certifying and Removing Disparate Impact , 2014, KDD.
[101] Paul A. Harris,et al. Secondary use of clinical data: The Vanderbilt approach , 2014, J. Biomed. Informatics.
[102] Yoshua Bengio,et al. On the Properties of Neural Machine Translation: Encoder–Decoder Approaches , 2014, SSST@EMNLP.
[103] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[104] Aaron Roth,et al. The Algorithmic Foundations of Differential Privacy , 2014, Found. Trends Theor. Comput. Sci..
[105] Xiaoqian Jiang,et al. DPSynthesizer: Differentially Private Data Synthesizer for Privacy Preserving Data Sharing , 2014, Proc. VLDB Endow..
[106] Yves Rosseel,et al. A Review of fMRI Simulation Studies , 2014, PloS one.
[107] Jun Zhang,et al. PrivBayes: private data release via bayesian networks , 2014, SIGMOD Conference.
[108] Aaron C. Courville,et al. Generative Adversarial Networks , 2014, arXiv:1406.2661.
[109] Joydeep Ghosh,et al. Perturbed Gibbs Samplers for Synthetic Data Release , 2013, arXiv:1312.5370.
[110] See-Kiong Ng,et al. Integrated Oversampling for Imbalanced Time Series Classification , 2013, IEEE Transactions on Knowledge and Data Engineering.
[111] Joydeep Ghosh,et al. Perturbed Gibbs Samplers for Generating Large-Scale Privacy-Safe Synthetic Health Data , 2013, 2013 IEEE International Conference on Healthcare Informatics.
[112] Kudakwashe Dube,et al. Approach and Method for Generating Realistic Synthetic Electronic Healthcare Records for Secondary Use , 2013, FHIES.
[113] Frank van Harmelen,et al. Knowledge-Based Patient Data Generation , 2013, KR4HC/ProHealth.
[114] Kewei Chen,et al. Identifying effective connectivity parameters in simulated fMRI: a direct comparison of switching linear dynamic system, stochastic dynamic causal, and multivariate autoregressive models , 2013, Front. Neurosci..
[115] Vince D. Calhoun,et al. SimTB, a simulation toolbox for fMRI data under a model of spatiotemporal separability , 2012, NeuroImage.
[116] Abdul V. Roudsari,et al. Computerization of workflows, guidelines, and care pathways: a review of implementation challenges for process-oriented health information systems , 2011, J. Am. Medical Informatics Assoc..
[117] Gholam-Ali Hossein-Zadeh,et al. A mutual information‐based metric for evaluation of fMRI data‐processing approaches , 2011, Human brain mapping.
[118] Karl J. Friston,et al. EEG and MEG Data Analysis in SPM8 , 2011, Comput. Intell. Neurosci..
[119] Anna L. Buczak,et al. Data-driven approach for creating synthetic electronic medical records , 2010, BMC Medical Informatics Decis. Mak..
[120] Jörg Drechsler,et al. Using Support Vector Machines for Generating Synthetic Datasets , 2010, Privacy in Statistical Databases.
[121] Jerome P. Reiter,et al. Random Forests for Generating Partially Synthetic, Categorical Data , 2010, Trans. Data Priv..
[122] Madhav V. Marathe,et al. Generation and analysis of large synthetic social contact networks , 2009, Proceedings of the 2009 Winter Simulation Conference (WSC).
[123] Anna L. Buczak,et al. Construction and Validation of Synthetic Electronic Medical Records , 2009, Online journal of public health informatics.
[124] Vicenç Torra,et al. Generation of synthetic data by means of fuzzy c-Regression , 2009, 2009 IEEE International Conference on Fuzzy Systems.
[125] H. Soltanian-Zadeh,et al. Mutual Information Based Metric for Evaluation of fMRI Data Processing Approaches , 2009, NeuroImage.
[126] Yoshua Bengio,et al. Extracting and composing robust features with denoising autoencoders , 2008, ICML '08.
[127] Haibo He,et al. ADASYN: Adaptive synthetic sampling approach for imbalanced learning , 2008, 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence).
[128] Ashwin Machanavajjhala,et al. Privacy: Theory meets Practice on the Map , 2008, 2008 IEEE 24th International Conference on Data Engineering.
[129] Kunal Talwar,et al. Mechanism Design via Differential Privacy , 2007, 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07).
[130] Martin A. Lindquist,et al. Modeling state-related fMRI activity using change-point theory , 2007, NeuroImage.
[131] L. Cox. Statistical Disclosure Limitation , 2006 .
[132] Cynthia Dwork,et al. Calibrating Noise to Sensitivity in Private Data Analysis , 2006, TCC.
[133] Hui Han,et al. Borderline-SMOTE: A New Over-Sampling Method in Imbalanced Data Sets Learning , 2005, ICIC.
[134] Lynda L. McGhie,et al. The Health Insurance Portability and Accountability Act , 2004 .
[135] Jim Burridge,et al. Information preserving statistical obfuscation , 2003, Stat. Comput..
[136] Luc Capdevila,et al. Document , 2003 .
[137] Leslie G. Ungerleider,et al. Neural Correlates of Visual Working Memory fMRI Amplitude Predicts Task Performance , 2002, Neuron.
[138] Gabriele Lohmann,et al. On Multivariate Spectral Analysis of fMRI Time Series , 2001, NeuroImage.
[139] Rathindra Sarathy,et al. A General Additive Data Perturbation Method for Database Security , 1999 .
[140] S. Hochreiter,et al. Long Short-Term Memory , 1997, Neural Computation.
[141] Huaiyu Zhu. On Information and Sufficiency , 1997 .
[142] C. Walck. Hand-book on statistical distributions for experimentalists , 1996 .
[143] G. Moody,et al. A database to support development and evaluation of intelligent intensive care monitoring , 1996, Computers in Cardiology 1996.
[144] Wai Lam,et al. Learning Bayesian Belief Networks: An Approach Based on the MDL Principle , 1994, Comput. Intell..
[145] David J. Spiegelhalter,et al. Local computations with probabilities on graphical structures and their application to expert systems , 1990 .
[146] M R Nuwer,et al. Quantitative EEG: I. Techniques and Problems of Frequency Analysis and Topographic Mapping , 1988, Journal of clinical neurophysiology : official publication of the American Electroencephalographic Society.
[147] Geoffrey E. Hinton,et al. Learning representations by back-propagating errors , 1986, Nature.
[148] G. Schwarz. Estimating the Dimension of a Model , 1978 .
[149] Michael J. Campbell,et al. Statistics at Square One , 1976, British medical journal.
[150] F. Massey. The Kolmogorov-Smirnov Test for Goodness of Fit , 1951 .
[151] Tahsin Kurc,et al. Generating Longitudinal Synthetic EHR Data with Recurrent Autoencoders and Generative Adversarial Networks , 2021, Poly/DMAH@VLDB.
[152] T. Skousen,et al. Electronic Healthcare Record , 2021, Encyclopedia of Gerontology and Population Aging.
[153] Wei Chang,et al. SMOOTH-GAN: Towards Sharp and Smooth Synthetic EHR Data Generation , 2020, AIME.
[154] Mihaela van der Schaar,et al. Time-series Generative Adversarial Networks , 2019, NeurIPS.
[155] Francisco Herrera,et al. SMOTE for Learning from Imbalanced Data: Progress and Challenges, Marking the 15-year Anniversary , 2018, J. Artif. Intell. Res..
[156] et al. Missing Data Imputation in the Electronic Health Record Using Deeply Learned Autoencoders , 2017, PSB.
[157] Jesper Wulff,et al. Multiple imputation by chained equations in praxis: Guidelines and review , 2017 .
[158] Paul Voigt,et al. The EU General Data Protection Regulation (GDPR) , 2017 .
[159] Stan Matwin,et al. A Review of Attribute Disclosure Control , 2015, Advanced Research in Data Privacy.
[160] Xiaoqian Jiang,et al. Differentially Private Synthesization of Multi-Dimensional Data using Copula Functions , 2014, EDBT.
[161] Randolph A. Miller,et al. Reducing patient re-identification risk for laboratory results within research datasets , 2013, J. Am. Medical Informatics Assoc..
[162] Michael Lin,et al. Synthetic Data , 2009, Encyclopedia of Database Systems.
[163] S M Smith,et al. Overview of fMRI analysis. , 2004, The British journal of radiology.
[164] Nitesh V. Chawla,et al. SMOTE: Synthetic Minority Over-sampling Technique , 2002, J. Artif. Intell. Res..
[165] R G Mark,et al. MIMIC II: a massive temporal ICU patient database to support research in intelligent patient monitoring , 2002, Computers in Cardiology.
[166] Geoffrey E. Hinton,et al. Stochastic Neighbor Embedding , 2002, NIPS.
[167] J Pardey,et al. A review of parametric modelling techniques for EEG analysis. , 1996, Medical engineering & physics.
[168] L. M. Hobbs. AUTOMATIC GENERATION OF , 1987 .
[169] M. Sklar. Fonctions de répartition à n dimensions et leurs marges , 1959 .
[170] J. Huang,et al. Creating synthetic minority class samples based on autoencoder extreme learning machine , 2022, Pattern Recognit..