Quantitative Models for Privacy Protection

We assume a database consisting of records of individuals, each with private or sensitive fields. Queries about the distribution of a sensitive field within a selected population in the database can be submitted to the data center. The answers to these queries can leak private information about individuals even though no identifying information is provided. Inspired by decision theory, this paper presents two quantitative models for the privacy protection problem in such a database query or linkage environment. The first models the value of information from the viewpoint of the querier; the second models the damage caused by privacy leakage and the compensation for it. In both models, we define an information state as a class of probability distributions on the set of possible confidential values. These states can be modified and refined by the user’s knowledge acquisition actions. In the first model, the value of information is defined as the expected gain of the privacy receiver, and privacy is protected by imposing costs on the answers to queries that balance this gain. In the second model, safety is guaranteed by requiring that anyone who misuses private information pay more in compensation than the possible gain.
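
As a rough illustration of the first model's idea, the value of information can be sketched in the standard decision-theoretic way: the querier's best expected gain after seeing a query answer, averaged over the possible answers, minus the best expected gain before the query. The sketch below is not the paper's formal definition. It assumes a single probability distribution per information state (rather than a class of distributions), and the names `best_expected_gain`, `value_of_information`, the payoff table, and all numbers are hypothetical.

```python
from typing import Dict, List, Tuple

Dist = Dict[str, float]                # confidential value -> probability
Payoff = Dict[str, Dict[str, float]]   # action -> (confidential value -> gain)


def best_expected_gain(dist: Dist, payoff: Payoff) -> float:
    """Expected gain of the best action under one information state."""
    return max(
        sum(dist[value] * gains[value] for value in dist)
        for gains in payoff.values()
    )


def value_of_information(
    prior: Dist,
    posteriors: List[Tuple[float, Dist]],
    payoff: Payoff,
) -> float:
    """Expected improvement in the querier's gain from one query.

    `posteriors` lists (probability of the answer, refined distribution)
    pairs, one per possible answer to the query.
    """
    after = sum(p * best_expected_gain(post, payoff) for p, post in posteriors)
    return after - best_expected_gain(prior, payoff)


# Hypothetical example: one sensitive field with two possible values.
prior = {"pos": 0.5, "neg": 0.5}
payoff = {
    "act": {"pos": 10.0, "neg": -10.0},   # assumed gains, for illustration only
    "ignore": {"pos": 0.0, "neg": 0.0},
}
# Each answer occurs with probability 0.5 and refines the prior.
posteriors = [
    (0.5, {"pos": 0.75, "neg": 0.25}),
    (0.5, {"pos": 0.25, "neg": 0.75}),
]

voi = value_of_information(prior, posteriors, payoff)
print(voi)
```

Under these assumptions, a cost on the answer at least equal to `voi` would offset the querier's expected gain, which is the balancing idea described for the first model. In the same spirit, the second model's condition would amount to checking that the required compensation exceeds the largest possible gain.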