Paper Information - Data Privacy and Utility Trade-Off Based on Mutual Information Neural Estimator


In the era of big data and the Internet of Things (IoT), data owners need to share large amounts of data with intended receivers over insecure environments, which creates a trade-off between user privacy and data utility. This privacy-utility trade-off has previously been addressed with a privacy funnel based on mutual information. However, mutual information is hard to characterize accurately when the sample size is small or the underlying distributions are unknown. In this article, we propose a privacy funnel based on the mutual information neural estimator (MINE) that optimizes the privacy-utility trade-off using estimated mutual information. Instead of computing mutual information in the traditional way, we estimate it with MINE, which learns the estimate by training a neural network, with the aim of making the estimate as accurate as possible. We use the estimated mutual information as the measure of both privacy and utility, and then formulate a problem that maximizes data utility by training a neural network while keeping the estimated privacy leakage below a threshold. Simulation results show that the mutual information estimated by MINE closely approximates the true mutual information even with a limited number of samples, so it can be used to quantify privacy leakage and data utility retention and to optimize the privacy-utility trade-off.
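To make the estimation idea concrete, here is a minimal sketch. It is not the authors' implementation. MINE trains a critic T(x, y) to maximize the Donsker-Varadhan lower bound I(X;Y) >= E_joint[T] - log E_marginals[exp(T)]. As an assumption made so that the example runs with NumPy alone, the neural network is swapped for a critic that is linear in three hand-picked quadratic features and is trained by plain full-batch gradient ascent. The function names (`features`, `dv_bound`, `estimate_mi`, `sample_correlated_gaussians`) and the Gaussian toy data are hypothetical and chosen only for illustration.

```python
import numpy as np


def features(x, y):
    """Quadratic features phi(x, y); the critic is T(x, y) = theta . phi(x, y)."""
    return np.stack([x * x, x * y, y * y], axis=1)


def dv_bound(theta, joint_feats, marginal_feats):
    """Donsker-Varadhan bound: E_joint[T] - log E_marginals[exp(T)]."""
    t_joint = joint_feats @ theta
    t_marg = marginal_feats @ theta
    shift = t_marg.max()  # subtract the max for a stable log-mean-exp
    log_mean_exp = shift + np.log(np.mean(np.exp(t_marg - shift)))
    return float(t_joint.mean() - log_mean_exp)


def estimate_mi(x, y, steps=3000, lr=0.05, seed=0):
    """Maximize the DV bound over theta; returns (estimate in nats, theta)."""
    rng = np.random.default_rng(seed)
    joint_feats = features(x, y)
    # Shuffling y breaks the pairing and approximates samples from P_X x P_Y.
    marginal_feats = features(x, rng.permutation(y))
    theta = np.zeros(3)
    for _ in range(steps):
        t_marg = marginal_feats @ theta
        weights = np.exp(t_marg - t_marg.max())
        weights /= weights.sum()  # softmax weights from the log-mean-exp term
        # Gradient of the bound: E_joint[phi] - weighted mean of phi on shuffled pairs.
        grad = joint_feats.mean(axis=0) - weights @ marginal_feats
        theta += lr * grad
    return dv_bound(theta, joint_feats, marginal_feats), theta


def sample_correlated_gaussians(n, rho, seed=0):
    """Draw n pairs of unit-variance Gaussians with correlation rho (toy data)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    y = rho * x + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    return x, y


if __name__ == "__main__":
    rho = 0.5
    x, y = sample_correlated_gaussians(20000, rho, seed=1)
    mi_hat, _ = estimate_mi(x, y)
    mi_true = -0.5 * np.log(1.0 - rho**2)  # closed form for this Gaussian pair
    print(f"estimated: {mi_hat:.3f} nats, closed form: {mi_true:.3f} nats")
```

For a jointly Gaussian pair, the log density ratio is quadratic in (x, y) up to an additive constant, and the bound is unaffected by constants, so this small critic family contains the optimal critic and the toy estimate can be checked against the closed form. In MINE proper, the critic is a neural network trained on mini-batches, which is what allows the same bound to be used when the distributions are unknown.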
