Context-Aware Generative Adversarial Privacy

Preserving the utility of published datasets while simultaneously providing provable privacy guarantees is a well-known challenge. On the one hand, context-free privacy solutions, such as differential privacy, provide strong privacy guarantees, but often lead to a significant reduction in utility. On the other hand, context-aware privacy solutions, such as information-theoretic privacy, achieve an improved privacy-utility tradeoff, but assume that the data holder has access to dataset statistics. We circumvent these limitations by introducing a novel context-aware privacy framework called generative adversarial privacy (GAP). GAP leverages recent advancements in generative adversarial networks (GANs) to allow the data holder to learn privatization schemes from the dataset itself. Under GAP, learning the privacy mechanism is formulated as a constrained minimax game between two players: a privatizer that sanitizes the dataset in a way that limits the risk of inference attacks on the individuals' private variables, and an adversary that tries to infer the private variables from the sanitized dataset. To evaluate GAP's performance, we investigate two simple (yet canonical) statistical dataset models: (a) the binary data model, and (b) the binary Gaussian mixture model. For both models, we derive game-theoretically optimal minimax privacy mechanisms, and show that the privacy mechanisms learned from data (in a generative adversarial fashion) match the theoretically optimal ones. This demonstrates that our framework can be easily applied in practice, even in the absence of dataset statistics.
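
To make the minimax formulation concrete, the sketch below shows one way the adversarial training described above could be set up for a toy binary data model: a privatizer network maps the public record (plus a randomization seed) to a sanitized release, an adversary network tries to infer the private bit from that release, and the two are updated in alternation. This is a minimal illustration under assumptions not stated in the abstract: PyTorch is used, the constraint is handled as a distortion penalty with weight rho and budget distortion_budget, and the data generator sample_binary_pairs, the network sizes, and all hyperparameters are hypothetical placeholders rather than the authors' actual architecture or training procedure.

    # Minimal GAP-style training sketch (assumptions: PyTorch, penalty-based
    # distortion constraint, toy binary data model; all names and settings
    # below are illustrative, not taken from the paper).
    import torch
    import torch.nn as nn

    def sample_binary_pairs(n, p_flip=0.2):
        """Toy binary data model: private bit Y; public bit X equals Y flipped w.p. p_flip."""
        y = torch.randint(0, 2, (n, 1)).float()
        flip = (torch.rand(n, 1) < p_flip).float()
        x = (y + flip) % 2
        return x, y

    # Privatizer: maps (public X, noise seed) to a relaxed sanitized release in (0, 1).
    privatizer = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1), nn.Sigmoid())
    # Adversary: tries to infer the private bit Y from the sanitized release.
    adversary = nn.Sequential(nn.Linear(1, 8), nn.ReLU(), nn.Linear(8, 1))

    opt_p = torch.optim.Adam(privatizer.parameters(), lr=1e-3)
    opt_a = torch.optim.Adam(adversary.parameters(), lr=1e-3)
    bce = nn.BCEWithLogitsLoss()
    rho = 5.0               # illustrative penalty weight for the distortion constraint
    distortion_budget = 0.25

    for step in range(5000):
        x, y = sample_binary_pairs(256)
        noise = torch.rand_like(x)                    # randomization source for the mechanism
        x_hat = privatizer(torch.cat([x, noise], 1))  # sanitized release

        # Adversary step: minimize log loss when inferring the private bit Y.
        loss_a = bce(adversary(x_hat.detach()), y)
        opt_a.zero_grad(); loss_a.backward(); opt_a.step()

        # Privatizer step: make the adversary's inference loss large while
        # keeping the average distortion within the budget (penalty form).
        adv_loss = bce(adversary(x_hat), y)
        distortion = (x_hat - x).abs().mean()
        loss_p = -adv_loss + rho * torch.clamp(distortion - distortion_budget, min=0.0)
        opt_p.zero_grad(); loss_p.backward(); opt_p.step()

In this sketch the privatizer outputs a continuous relaxation of the binary release; a deployed mechanism would threshold or sample from it, and the learned scheme could then be compared against the game-theoretically optimal mechanism for the same binary model.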
