Towards a respondent-preferred ki-anonymity model

Privacy concerns about data collection have received increasing attention in recent years. In the data collection process, a data collector (an agency) typically assumes that all respondents will be comfortable submitting their data as long as the published data is anonymous. We believe this assumption is unrealistic: growing privacy concerns cause some respondents to refuse participation or to submit inaccurate data to such agencies. If respondents submit inaccurate data, the usefulness of any analysis of the collected data cannot be guaranteed. Furthermore, respondents cannot verify the level of anonymity (i.e., k-anonymity) guaranteed by an agency, since they generally do not have access to all of the released data. We therefore introduce the notion of ki-anonymity, where ki is the level of anonymity preferred by respondent i. Instead of placing full trust in an agency, our solution increases respondent confidence by allowing each respondent to choose a preferred level of protection. Our protocol ensures that respondents achieve their preferred ki-anonymity during data collection and guarantees that the collected records are genuine and useful for data analysis.
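
To make the notion concrete, the following is a minimal sketch of one possible reading of ki-anonymity, not the protocol proposed in the paper. It assumes that respondent i is satisfied when at least ki records (including their own) share the same quasi-identifier values, by analogy with standard k-anonymity. The function name `satisfies_ki_anonymity`, the record layout, and the sample values are all hypothetical.

```python
from collections import Counter


def satisfies_ki_anonymity(records, preferences):
    """Check a per-respondent anonymity condition (illustrative assumption).

    records:     list of hashable quasi-identifier tuples, one per respondent
    preferences: list of integers, where preferences[i] is respondent i's ki

    Returns True if every respondent's record shares its quasi-identifier
    values with at least ki records in total (its own record included).
    """
    if len(records) != len(preferences):
        raise ValueError("one preference is required per record")

    # Size of each equivalence class (records with identical quasi-identifiers).
    class_sizes = Counter(records)

    return all(class_sizes[qi] >= k for qi, k in zip(records, preferences))


# Hypothetical generalized records: (age range, truncated postal code).
records = [
    ("30-39", "021**"),
    ("30-39", "021**"),
    ("30-39", "021**"),
    ("40-49", "946**"),
    ("40-49", "946**"),
]

# Each respondent's preferred level ki (assumed values).
satisfied = satisfies_ki_anonymity(records, [2, 3, 3, 2, 2])

# The last respondent now asks for ki = 3, but only two records share
# that quasi-identifier combination.
violated = satisfies_ki_anonymity(records, [2, 3, 3, 2, 3])
```

Under this reading, ordinary k-anonymity is the special case in which every respondent has the same ki = k.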
