A comparison of synthetic data approaches using utility and disclosure risk measures
