Distributed Semi-Private Image Classification Based on Information-Bottleneck Principle

We propose a framework for semi-private image classification based on the information-bottleneck principle. Each user trains a model on their own data class and shares only the resulting output features with a central server; the model parameters are never shared. Each user trains an autoencoder to learn a representation of their private data distribution, and the resulting features remain sufficiently discriminative across the private datasets. The central server aggregates all labeled output features, together with a small subset of the private data, into a final classifier over the classes of all users. The size of this subset governs the trade-off between privacy and classification performance. We demonstrate the viability of the scheme empirically and showcase the resulting privacy-performance trade-off.
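The scheme above can be sketched in a few lines. This is a toy illustration, not the paper's implementation: it assumes two users with synthetic Gaussian data, replaces each user's autoencoder with its closed-form linear counterpart (the optimal linear autoencoder is PCA), and uses a nearest-centroid rule as a stand-in for the server's final classifier. Only the latent features leave each user; the encoder parameters stay local.

```python
import numpy as np

rng = np.random.default_rng(0)

def pca_encoder(X, k):
    """Closed-form linear autoencoder: encode with the top-k principal
    directions of the user's private data. The k-dim features are the
    only thing that ever leaves the user."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / len(X)
    vals, vecs = np.linalg.eigh(cov)
    return vecs[:, np.argsort(vals)[::-1][:k]]  # d x k encoder matrix

# Two users, each holding one private class (synthetic blobs in 20-D).
X0 = rng.normal(loc=0.0, size=(200, 20))
X1 = rng.normal(loc=3.0, size=(200, 20))

# Each user fits an encoder locally; encoder parameters are never shared.
W0 = pca_encoder(X0, k=5)
W1 = pca_encoder(X1, k=5)

# Users upload only labeled latent features to the server.
Z = np.vstack([X0 @ W0, X1 @ W1])
y = np.array([0] * len(X0) + [1] * len(X1))

# Server-side classifier over all users' classes (nearest centroid here,
# standing in for the final classifier trained on the pooled features).
centroids = np.stack([Z[y == c].mean(axis=0) for c in (0, 1)])

def server_classify(z):
    return int(np.argmin(np.linalg.norm(centroids - z, axis=1)))
```

At test time a sample is encoded by its owner's local encoder and classified centrally from the feature alone, so the server never observes raw private images outside the shared subset.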
