Assessing the effectiveness of the noise addition method of preserving confidentiality in the multivariate normal case

Abstract

A common method of preserving the confidentiality of subjects of statistical studies consists of adding zero-mean random variables to the attributes of individual records in released microdata files. If this method is to be effective, however, it is necessary to have a meaningful measure of the degree to which the privacy of the individual is protected. Two measures of confidentiality are proposed. These quantities measure the ability of a devious user to correctly identify an individual. Bounds on these measures are found in the case of a sample from a multivariate normal population to which multivariate normal noise vectors have been added; it is assumed that the covariance structure of the noise vectors is the same as that of the original population. From these bounds, it is concluded that a particular person's record can be easily identified if that individual's attributes are far from the population's mean vector or if the number of attributes is large.
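The setting described above can be illustrated with a small simulation. The following is only a sketch under stated assumptions and does not compute the two confidentiality measures proposed in the paper, which the abstract does not define. It assumes an identity population covariance, noise covariance equal to a hypothetical multiple `noise_scale` of the population covariance, and a user who already knows each individual's true attributes and guesses the released record closest in Mahalanobis distance. The function name `reidentification_rate` and all parameter values are illustrative choices, not taken from the paper.

```python
import numpy as np


def reidentification_rate(n, p, noise_scale, rng):
    """Fraction of individuals whose released record is the nearest match.

    Assumptions (not from the paper): identity population covariance,
    noise covariance equal to noise_scale times the population covariance,
    and an attacker who matches by Mahalanobis distance.
    """
    sigma = np.eye(p)                      # assumed population covariance
    mean = np.zeros(p)

    # True attribute vectors for n individuals.
    x = rng.multivariate_normal(mean, sigma, size=n)

    # Zero-mean noise with the same covariance structure as the population.
    noise = rng.multivariate_normal(mean, noise_scale * sigma, size=n)
    released = x + noise

    # Squared Mahalanobis distance from each true record (rows)
    # to each released record (columns).
    diff = x[:, None, :] - released[None, :, :]
    sigma_inv = np.linalg.inv(sigma)
    d2 = np.einsum("ijk,kl,ijl->ij", diff, sigma_inv, diff)

    # The attacker guesses the closest released record for each individual.
    guess = d2.argmin(axis=1)
    return float(np.mean(guess == np.arange(n)))


# Compare a small and a large number of attributes at the same noise level.
rate_few = reidentification_rate(n=200, p=2, noise_scale=0.25,
                                 rng=np.random.default_rng(0))
rate_many = reidentification_rate(n=200, p=20, noise_scale=0.25,
                                  rng=np.random.default_rng(0))
print(f"p=2:  {rate_few:.2f}")
print(f"p=20: {rate_many:.2f}")
```

Under these assumptions, the matching rate with 20 attributes should be noticeably higher than with 2, which is in line with the abstract's conclusion that a large number of attributes makes identification easier. The nearest-record attack is only one plausible stand-in for a devious user, so the numbers it produces should not be read as the bounds derived in the paper.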