PUTWorkbench: Analysing Privacy in AI-intensive Systems

AI-intensive systems that operate on user data face the challenge of balancing data utility against privacy concerns. We propose the idea of, and present a prototype for, an open-source tool called the Privacy Utility Trade-off (PUT) Workbench, which aims to help software practitioners make such crucial decisions. We pick a simple privacy model that requires no background in data science and show that even this model achieves significant results on standard and real-life datasets. The tool and its source code are made freely available for use and extension.
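The abstract does not name the "simple privacy model" here, but given the work's focus, k-anonymity is a representative example of a privacy notion that requires no data-science background: a released table is k-anonymous if every combination of quasi-identifier values is shared by at least k rows. A minimal sketch of checking this level (a hypothetical helper, not the tool's actual API) might look like:

```python
from collections import Counter

def k_anonymity(rows, quasi_identifiers):
    """Return the k-anonymity level: the size of the smallest
    group of rows sharing the same quasi-identifier values."""
    groups = Counter(
        tuple(row[attr] for attr in quasi_identifiers) for row in rows
    )
    return min(groups.values())

# Toy released table: age generalized to ranges, zip code truncated.
released = [
    {"age": "20-30", "zip": "148**", "disease": "flu"},
    {"age": "20-30", "zip": "148**", "disease": "cold"},
    {"age": "30-40", "zip": "130**", "disease": "flu"},
    {"age": "30-40", "zip": "130**", "disease": "asthma"},
]
print(k_anonymity(released, ["age", "zip"]))  # → 2
```

Coarser generalization raises k (more privacy) but erases detail that a downstream classifier could use; the privacy-utility trade-off is the tension between these two effects.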
