Privacy Protection for Flexible Parametric Survival Models

Data privacy is a major concern in modern society. In this work, we propose two solutions to the privacy-preserving problem of fitting regression models to medical data. We focus on flexible parametric models, which are powerful alternatives to the well-known Cox regression model. In the first approach, we propose a sampling mechanism that guarantees differential privacy for flexible parametric survival models. We first transform the likelihood function of the models so that likelihood values are bounded. We then use a Hamiltonian Monte Carlo sampler to draw a random parameter vector from the posterior distribution; because the likelihood is bounded, this random vector satisfies the requirement for differential privacy. In the second approach, motivated by the importance of accurate, high-confidence predictions in medical applications, we propose a mechanism that protects privacy by randomly perturbing the posterior distribution. The sampler can then draw multiple random samples from the perturbed posterior to estimate credible intervals for the parameters. This mechanism does not guarantee differential privacy for the perturbed posterior, but it does allow control over the contribution of each individual data record to the posterior. In the worst-case scenario, in which all data records except the target record are revealed, the random noise added to the posterior would make it extremely difficult to recover the target record. Experiments on two real datasets show that the proposed approaches outperform state-of-the-art methods in predicting the survival rate of individuals.
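The first approach (bound the likelihood, then release a single posterior draw) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: a two-parameter Weibull model stands in for the flexible parametric model, simple clipping of each record's log-likelihood to `[-bound, bound]` stands in for the likelihood transformation, and a random-walk Metropolis sampler replaces the Hamiltonian Monte Carlo sampler to keep the example dependency-free. The function names, the Gaussian prior, and the tuning constants are hypothetical, and the code makes no formal privacy claim.

```python
import numpy as np

def clipped_loglik(theta, times, events, bound):
    """Sum of per-record Weibull log-likelihoods, each clipped to [-bound, bound].

    theta = (log shape, log scale); events[i] = 1 if observed, 0 if right-censored.
    Clipping is an illustrative stand-in for the paper's likelihood transformation.
    """
    log_shape, log_scale = theta
    k, lam = np.exp(log_shape), np.exp(log_scale)
    z = times / lam
    log_hazard = np.log(k) - np.log(lam) + (k - 1.0) * np.log(z)
    cum_hazard = z ** k
    per_record = events * log_hazard - cum_hazard
    return np.clip(per_record, -bound, bound).sum()

def log_posterior(theta, times, events, bound, prior_sd=2.0):
    # Independent zero-mean Gaussian prior on both parameters (an assumption).
    log_prior = -0.5 * np.sum((np.asarray(theta) / prior_sd) ** 2)
    return clipped_loglik(theta, times, events, bound) + log_prior

def sample_once(times, events, bound, n_steps=2000, step=0.1, seed=0):
    """Return one approximate posterior draw via random-walk Metropolis.

    Only the final state is returned, mirroring the idea of releasing
    a single random parameter vector rather than a point estimate.
    """
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)
    current = log_posterior(theta, times, events, bound)
    for _ in range(n_steps):
        proposal = theta + step * rng.normal(size=2)
        candidate = log_posterior(proposal, times, events, bound)
        # Accept with probability min(1, exp(candidate - current)).
        if np.log(rng.uniform()) < candidate - current:
            theta, current = proposal, candidate
    return theta
```

Because each record's contribution is confined to `[-bound, bound]`, replacing any single record changes the clipped log-likelihood by at most `2 * bound`, which is the kind of bounded sensitivity that posterior-sampling privacy arguments rely on. A Markov chain run for finitely many steps only approximates the posterior, so any guarantee derived for exact sampling would not carry over to this sketch unchanged.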
