A widely used method for protecting confidentiality in statistical databases is to add zero-mean noise to sensitive attribute values. Most studies assume that the attributes are normally distributed. Using an exponential random variable as an example, this article investigates the effect of additive-noise data masking on attributes with skewed distributions. Examples of exponentially distributed sensitive attributes used in statistical analysis include the time between testing HIV-positive and the manifestation of AIDS symptoms, and the time between consecutive arrests for repeat offenders. We analyze both data quality and confidentiality protection. Our results indicate that skewed attributes are, in some sense, better protected than normally distributed attributes under additive-noise data masking.
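As a rough illustration of the masking scheme described above, the following minimal sketch perturbs an exponentially distributed attribute with independent zero-mean noise. The Gaussian noise distribution, the mean of 5.0, the noise level of half the attribute's standard deviation, and the helper name `mask_with_additive_noise` are all assumptions chosen for the example; the article itself may use different settings.

```python
import random
import statistics


def mask_with_additive_noise(values, noise_sd, rng):
    """Return a copy of values with independent zero-mean Gaussian noise added.

    Gaussian noise is an assumption made for this sketch.
    """
    return [v + rng.gauss(0.0, noise_sd) for v in values]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical sensitive attribute: exponential waiting times with mean 5.0.
mean_time = 5.0
original = [rng.expovariate(1.0 / mean_time) for _ in range(20000)]

# Assumed noise level: half the standard deviation of the attribute.
noise_sd = 0.5 * statistics.pstdev(original)
masked = mask_with_additive_noise(original, noise_sd, rng)

# Zero-mean noise leaves the mean roughly unchanged but inflates the variance.
print("mean:    ", statistics.fmean(original), statistics.fmean(masked))
print("variance:", statistics.pvariance(original), statistics.pvariance(masked))
print("negative masked values:", sum(1 for v in masked if v < 0))
```

Under these assumptions the masked mean stays close to the original mean, while the masked variance grows by roughly the noise variance. Some masked values also fall below zero, which cannot happen for a true waiting time; this is one way a skewed, non-negative attribute behaves differently from a normal one under additive noise.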