A Conceptual Framework for Assessing Anonymization-Utility Trade-Offs Based on Principal Component Analysis

An anonymization technique for databases is proposed that employs Principal Component Analysis. The technique aims to release the least possible amount of information while preserving the utility of the data released in response to queries. The general scheme is described, and alternative metrics are proposed to assess utility, based respectively on matrix norms, correlation coefficients, divergence measures, and quality indices of database images. This approach makes it possible to measure the utility of the output data properly and to incorporate that measure into the anonymization method.
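
As an illustration only, the trade-off described above can be sketched with a truncated PCA reconstruction: fewer retained components release less information, and a utility metric quantifies what is lost. The abstract does not specify the release mechanism, so the rank-k reconstruction, the function names (`pca_release`, `relative_frobenius_loss`, `mean_column_correlation`), and the synthetic data below are all assumptions rather than the authors' method. Only the matrix-norm and correlation-based metrics are shown.

```python
import numpy as np


def pca_release(X, k):
    """Hypothetical release step: keep only the top-k principal components.

    Assumption: the released table is the rank-k PCA reconstruction of X.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data; the rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Vk = Vt[:k]
    # Project onto the top-k directions and map back to the original space.
    return Xc @ Vk.T @ Vk + mean


def relative_frobenius_loss(X, X_rel):
    """Matrix-norm metric: reconstruction error relative to the centered data."""
    Xc = X - X.mean(axis=0)
    return np.linalg.norm(X - X_rel, ord="fro") / np.linalg.norm(Xc, ord="fro")


def mean_column_correlation(X, X_rel):
    """Correlation metric: mean Pearson correlation between matching columns."""
    cors = [np.corrcoef(X[:, j], X_rel[:, j])[0, 1] for j in range(X.shape[1])]
    return float(np.mean(cors))


# Synthetic correlated data (illustrative only, not from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))

losses = []
correlations = []
for k in range(1, 6):
    X_rel = pca_release(X, k)
    losses.append(relative_frobenius_loss(X, X_rel))
    correlations.append(mean_column_correlation(X, X_rel))

for k, (loss, cor) in enumerate(zip(losses, correlations), start=1):
    print(f"k={k}: relative Frobenius loss={loss:.3f}, mean correlation={cor:.3f}")
```

In this sketch the relative loss cannot increase as k grows and vanishes (up to rounding) when all five components are kept, so the number of components acts as the knob between releasing less information and preserving utility.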
