Alleviating Privacy Attacks via Causal Learning

Machine learning models, especially deep neural networks, have been shown to be susceptible to privacy attacks such as membership inference, in which an adversary can detect whether a data point was used to train a black-box model. Such privacy risks are exacerbated when a model's predictions are used on data from a distribution other than the one it was trained on. To alleviate these attacks, we demonstrate the benefit of predictive models that are based on the causal relationships between input features and the outcome. We first show that models learnt using causal structure generalize better to unseen data, especially data drawn from distributions that differ from the training distribution. Based on this generalization property, we establish a theoretical link between causality and privacy: compared to associational models, causal models provide stronger differential privacy guarantees and are more robust to membership inference attacks. Experiments on simulated Bayesian networks and the colored-MNIST dataset show that associational models exhibit up to 80% attack accuracy under different test distributions and sample sizes, whereas causal models exhibit attack accuracy close to that of a random guess.
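To make the threat model concrete, below is a minimal sketch of a confidence-threshold membership inference attack of the kind the abstract alludes to: the adversary guesses "member" whenever the model's confidence in the true label exceeds a threshold. This is an illustrative sketch, not necessarily the paper's exact attack; the function name, the `threshold` parameter, and the scikit-learn-style `predict_proba` interface are assumptions for illustration.

```python
import numpy as np

def membership_inference_accuracy(model, X_in, y_in, X_out, y_out, threshold=0.5):
    """Confidence-threshold membership inference attack (sketch).

    Guess 'member' when the model's confidence in the true label exceeds
    `threshold`; return the attack's balanced accuracy over known members
    (X_in, training points) and known non-members (X_out, held-out points).
    """
    # Confidence the model assigns to the true class of each point.
    conf_in = model.predict_proba(X_in)[np.arange(len(y_in)), y_in]
    conf_out = model.predict_proba(X_out)[np.arange(len(y_out)), y_out]

    tpr = np.mean(conf_in > threshold)   # members correctly flagged
    fpr = np.mean(conf_out > threshold)  # non-members wrongly flagged
    return 0.5 * (tpr + (1.0 - fpr))     # balanced attack accuracy
```

Under this attack, a model whose confidence distributions on members and non-members overlap (as the paper argues holds for causal models, even under distribution shift) yields accuracy near 0.5, a random guess; a model that overfits or is evaluated on a shifted test distribution separates the two distributions and drives the attack accuracy up.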
