An Enhanced Data Perturbation Approach for Small Data Sets

As modern organizations gather, analyze, and share large quantities of data, issues of privacy, and confidentiality are becoming increasingly important. Perturbation methods are used to protect confidentiality when confidential, numerical data are shared or disseminated for analysis. Unfortunately, existing perturbation methods are not suitable for protecting small data sets. With small data sets, existing perturbation methods result in reduced protection against disclosure risk due to sampling error. Sampling error may also produce different results from the analysis of perturbed data compared to the original data, reducing data utility. In this study, we develop an enhancement of an existing perturbation technique, General Additive Data Perturbation, that can be used to effectively mask both large and small data sets. The proposed enhancement minimizes the risk of disclosure while ensuring that the results of commonly performed statistical analyses are identical and equal for both the original and the perturbed data.