Learning Anonymized Representations with Adversarial Neural Networks

Statistical methods that protect sensitive information or the identity of the data owner have become critical to ensuring the privacy of individuals as well as organizations. This paper investigates anonymization methods based on representation learning and deep neural networks, motivated by novel information-theoretic bounds. We introduce a training objective that simultaneously learns a predictor over the target variables of interest (the regular labels) while preventing an intermediate representation from being predictive of the private labels. The architecture is based on three sub-networks: one mapping the input to the representation, one mapping the representation to the predicted regular labels, and one mapping the representation to the predicted private labels. The training procedure aims to learn representations that preserve the relevant information (about the regular labels) while discarding information about the private labels, which correspond to the identity of a person. We demonstrate the success of this approach on two distinct classification-versus-anonymization tasks (handwritten digits and sentiment analysis).
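A common way to realize this kind of objective is to reward the encoder for the regular-label prediction while penalizing it whenever the private-label branch succeeds. The following is a minimal numpy sketch of such a combined loss, assuming a standard cross-entropy formulation and a hypothetical trade-off weight `lam`; the paper's exact objective and information-theoretic bounds are not reproduced here.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(logits, labels):
    # Mean negative log-likelihood of the correct class.
    p = softmax(logits)
    return -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))

def anonymization_loss(task_logits, task_labels,
                       private_logits, private_labels, lam=1.0):
    # The encoder minimizes this quantity: it is rewarded for
    # predicting the regular labels and penalized when the
    # representation lets the private branch recover the private
    # labels (lam is a hypothetical trade-off weight).
    return (cross_entropy(task_logits, task_labels)
            - lam * cross_entropy(private_logits, private_labels))

# Illustration: confident, correct regular predictions plus a
# private branch reduced to chance level give a negative loss.
task_logits = np.array([[5.0, 0.0], [0.0, 5.0]])
private_logits = np.zeros((2, 2))  # uniform -> adversary at chance
loss = anonymization_loss(task_logits, np.array([0, 1]),
                          private_logits, np.array([0, 1]))
```

In a full adversarial setup the private branch would be trained in the opposite direction (to minimize its own cross-entropy), e.g. via alternating updates or a gradient-reversal layer between the encoder and the private predictor.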
