An empirical evaluation of easily implemented, nonparametric methods for generating synthetic datasets
[1] Wei-Yin Loh,et al. Classification and regression trees , 2011, WIREs Data Mining Knowl. Discov..
[2] Jerome P. Reiter,et al. Multiple imputation for missing data via sequential regression trees , 2010, American journal of epidemiology.
[3] Jerome P. Reiter. Significance tests for multi-component estimands from multiply imputed, synthetic microdata , 2005 .
[4] Jerome P. Reiter,et al. The Multiple Adaptations of Multiple Imputation , 2007 .
[5] Julia Lane,et al. Measuring the Impact of Data Protection Techniques on Data Utility: Evidence from the Survey of Consumer Finances , 2006, Privacy in Statistical Databases.
[6] Jörg Drechsler,et al. Comparing Fully and Partially Synthetic Datasets for Statistical Disclosure Control in the German IAB Establishment Panel , 2008, Trans. Data Priv..
[7] Raul Cano. On The Bayesian Bootstrap , 1992 .
[8] Jörg Drechsler,et al. Accounting for Intruder Uncertainty Due to Sampling When Estimating Identification Disclosure Risks in Partially Synthetic Data , 2008, Privacy in Statistical Databases.
[9] Jerome P. Reiter,et al. Making public use, synthetic files of the Longitudinal Business Database , 2022 .
[10] Javier M. Moguerza,et al. Support Vector Machines with Applications , 2006, math/0612817.
[11] Keying Ye,et al. Applied Bayesian Modeling and Causal Inference From Incomplete-Data Perspectives , 2005, Technometrics.
[12] L. Sweeney. Computational Disclosure Control for Medical Microdata , 1997 .
[13] J. Gerring. A case study , 2011, Technology and Society.
[14] Gary Benedetto,et al. Distribution-Preserving Statistical Disclosure Limitation , 2007, Comput. Stat. Data Anal..
[15] Fang Liu,et al. Statistical Disclosure Techniques Based on Multiple Imputation , 2005 .
[16] Jerome P. Reiter,et al. Adjusting Survey Weights When Altering Identifying Design Variables Via Synthetic Data , 2006, Privacy in Statistical Databases.
[17] Jerome P. Reiter,et al. Random Forests for Generating Partially Synthetic, Categorical Data , 2010, Trans. Data Priv..
[18] Andrew Gelman,et al. Applied Bayesian Modeling And Causal Inference From Incomplete-Data Perspectives , 2005 .
[19] Jerome P. Reiter,et al. Sampling With Synthesis: A New Approach for Releasing Public Use Census Microdata , 2010 .
[20] P. Doyle,et al. Confidentiality, Disclosure and Data Access: Theory and Practical Applications for Statistical Agencies , 2001 .
[21] Jerome P. Reiter,et al. Releasing multiply imputed, synthetic public use microdata: an illustration and empirical study , 2005 .
[22] Jerome P. Reiter,et al. Using CART to generate partially synthetic public use microdata , 2005 .
[23] Insuk Sohn,et al. Selecting marker genes for cancer classification using supervised weighted kernel clustering and the support vector machine , 2009, Comput. Stat. Data Anal..
[24] L. Willenborg,et al. Elements of Statistical Disclosure Control , 2000 .
[25] Hosik Choi,et al. Gene selection and prediction for cancer classification using support vector machines with a reject option , 2011, Comput. Stat. Data Anal..
[26] Chih-Jen Lin,et al. A Practical Guide to Support Vector Classification , 2008 .
[27] Thomas Zwick,et al. A new approach for disclosure control in the IAB establishment panel—multiple imputation for a better data access , 2008 .
[28] Richard Penny,et al. Multiply Imputed Synthetic Data Files , 2007 .
[29] Giuseppe Porro,et al. Missing data imputation, matching and other applications of random recursive partitioning , 2007, Comput. Stat. Data Anal..
[30] Jerome P. Reiter. Estimating Risks of Identification Disclosure in Microdata , 2005 .
[31] John M. Abowd,et al. Final Report to the Social Security Administration on the SIPP/SSA/IRS Public Use File Project , 2006 .
[32] Leo Breiman,et al. Random Forests , 2001, Machine Learning.
[33] Leo Breiman,et al. Bagging Predictors , 1996, Machine Learning.
[34] Natalie Shlomo,et al. Assessing Identification Risk in Survey Microdata Using Log-Linear Models , 2008 .
[35] John M. Abowd,et al. Multiply-Imputing Confidential Characteristics and File Links in Longitudinal Linked Data , 2004, Privacy in Statistical Databases.
[36] Chih-Jen Lin,et al. Probability Estimates for Multi-class Classification by Pairwise Coupling , 2003, J. Mach. Learn. Res..
[37] W. Winkler. Examples of Easy-to-implement, Widely Used Methods of Masking for which Analytic Properties are not Justified , 2008 .
[38] Jerome P. Reiter,et al. Estimating Risks of Identification Disclosure in Partially Synthetic Data , 2009, J. Priv. Confidentiality.
[39] A. Kennickell. Multiple Imputation and Disclosure Protection: The Case of the 1995 Survey of Consumer Finances , 2000 .
[40] Bernhard Schölkopf,et al. A tutorial on support vector regression , 2004, Stat. Comput..
[41] Joerg Drechsler,et al. New data dissemination approaches in old Europe – synthetic datasets for a German establishment survey , 2012 .
[42] Bernhard E. Boser,et al. A training algorithm for optimal margin classifiers , 1992, COLT '92.
[43] Jörg Drechsler,et al. Using Support Vector Machines for Generating Synthetic Datasets , 2010, Privacy in Statistical Databases.
[44] D. Pager,et al. Estimating Risk , 2010, Social psychology quarterly.
[45] John Van Hoewyk,et al. A multivariate technique for multiply imputing missing values using a sequence of regression models , 2001 .
[46] M. Elliot,et al. A Case Study of the Impact of Statistical Disclosure Control on Data Quality in the Individual UK Samples of Anonymised Records , 2007 .
[47] Simon D. Woodcock,et al. Disclosure Limitation in Longitudinal Linked Data , 2002 .