Paper Information - k-Anonymity: A Model for Protecting Privacy

k-Anonymity: A Model for Protecting Privacy

Consider a data holder, such as a hospital or a bank, that has a privately held collection of person-specific, field-structured data. Suppose the data holder wants to share a version of the data with researchers. How can a data holder release a version of its private data with scientific guarantees that the individuals who are the subjects of the data cannot be re-identified while the data remain practically useful? The solution provided in this paper includes a formal protection model named k-anonymity and a set of accompanying policies for deployment. A release provides k-anonymity protection if the information for each person contained in the release cannot be distinguished from at least k-1 individuals whose information also appears in the release. This paper also examines re-identification attacks that can be realized on releases that adhere to k-anonymity unless accompanying policies are respected. The k-anonymity protection model is important because it forms the basis on which the real-world systems known as Datafly, µ-Argus and k-Similar provide guarantees of privacy protection.
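The k-anonymity condition stated in the abstract can be illustrated with a minimal sketch. It assumes the release is a list of records and that the caller names the attributes an attacker could link to outside data (the paper calls these the quasi-identifier). The function name `is_k_anonymous`, the column names, and the sample rows are hypothetical and chosen only for illustration; this is a check of the property, not the algorithm used by Datafly, µ-Argus or k-Similar.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    that occurs in `rows` occurs in at least `k` rows."""
    # Count how many records share each quasi-identifier combination.
    counts = Counter(
        tuple(row[attr] for attr in quasi_identifiers) for row in rows
    )
    # Each record must be indistinguishable from at least k-1 others.
    return all(count >= k for count in counts.values())


# Hypothetical release: ZIP codes already generalized, diagnosis is the
# sensitive attribute and is not part of the quasi-identifier.
release = [
    {"zip": "021**", "birth_year": 1965, "diagnosis": "flu"},
    {"zip": "021**", "birth_year": 1965, "diagnosis": "asthma"},
    {"zip": "021**", "birth_year": 1970, "diagnosis": "flu"},
    {"zip": "021**", "birth_year": 1970, "diagnosis": "diabetes"},
]

print(is_k_anonymous(release, ["zip", "birth_year"], 2))
print(is_k_anonymous(release, ["zip", "birth_year"], 3))
```

In this sample each (zip, birth_year) combination is shared by exactly two records, so the release satisfies the condition for k = 2 but not for k = 3. Passing the check says nothing about the accompanying policies the abstract mentions, which address attacks that remain possible on releases that adhere to k-anonymity.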

Latanya Sweeney
