Adversarial Privacy Preservation under Attribute Inference Attack

With the prevalence of machine learning services, crowdsourced data containing sensitive information poses substantial privacy challenges. Existing work that protects against membership inference attacks under the rigorous framework of differential privacy remains vulnerable to attribute inference attacks. In light of this gap between theory and practice, we develop a novel theoretical framework for privacy preservation under attribute inference attacks. Under our framework, we propose a minimax optimization formulation to protect the given attribute and analyze its privacy guarantees against arbitrary adversaries. On the other hand, the privacy constraint may cripple utility when the protected attribute is correlated with the target variable. We therefore also prove an information-theoretic lower bound that precisely characterizes the fundamental trade-off between utility and privacy. Empirically, we conduct extensive experiments to corroborate our privacy guarantee and validate the inherent trade-offs of different privacy-preservation algorithms. Our experimental results indicate that adversarial representation learning approaches achieve the best trade-off between privacy preservation and utility maximization.
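To make the minimax formulation concrete, the sketch below shows the kind of alternating adversarial training it describes: an encoder f and task head h are trained to predict the target Y while an adversary g is simultaneously trained to infer the protected attribute A from the representation Z = f(X), i.e., min over (f, h) and max over g of L_task(h∘f) − λ·L_adv(g∘f). This is a minimal illustration under our own assumptions, not the paper's implementation; the architectures, dimensions, trade-off weight `lam`, and the alternating update schedule are all placeholder choices.

```python
# Minimal sketch of adversarial representation learning for attribute
# protection (illustrative only; hyperparameters and architectures are
# assumptions, not the paper's).
import torch
import torch.nn as nn

d_in, d_rep, n_y, n_a, lam = 32, 16, 2, 2, 1.0  # assumed sizes / trade-off weight

encoder   = nn.Sequential(nn.Linear(d_in, d_rep), nn.ReLU())
task_head = nn.Linear(d_rep, n_y)   # predicts the target variable Y
adversary = nn.Linear(d_rep, n_a)   # tries to infer the protected attribute A

opt_main = torch.optim.Adam(list(encoder.parameters()) +
                            list(task_head.parameters()), lr=1e-3)
opt_adv  = torch.optim.Adam(adversary.parameters(), lr=1e-3)
ce = nn.CrossEntropyLoss()

def train_step(x, y, a):
    # Inner maximization: fit the adversary on the current (frozen)
    # representation so it approximates the worst-case attacker.
    with torch.no_grad():
        z = encoder(x)
    adv_loss = ce(adversary(z), a)
    opt_adv.zero_grad(); adv_loss.backward(); opt_adv.step()

    # Outer minimization: minimize the task loss while maximizing the
    # adversary's loss, pushing information about A out of Z.
    z = encoder(x)
    loss = ce(task_head(z), y) - lam * ce(adversary(z), a)
    opt_main.zero_grad(); loss.backward(); opt_main.step()
    return loss.item()

# Example call on random data (batch of 8):
x = torch.randn(8, d_in)
y = torch.randint(0, n_y, (8,))
a = torch.randint(0, n_a, (8,))
train_step(x, y, a)
```

Increasing `lam` tightens the privacy constraint on Z at the expense of task accuracy, which is exactly the utility-privacy trade-off the lower bound quantifies.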
