Security-control methods for statistical databases: a comparative study

This paper considers the problem of providing security to statistical databases against disclosure of confidential information. Security-control methods suggested in the literature are classified into four general approaches: conceptual, query restriction, data perturbation, and output perturbation. Criteria for evaluating the performance of the various security-control methods are identified. Security-control methods that are based on each of the four approaches are discussed, together with their performance with respect to the identified evaluation criteria. A detailed comparative analysis of the most promising methods for protecting dynamic-online statistical databases is also presented. To date, no single security-control method prevents both exact and partial disclosures. There are, however, a few perturbation-based methods that prevent exact disclosure and enable the database administrator to exercise "statistical disclosure control." Some of these methods, however, introduce bias into query responses or suffer from the 0/1 query-set-size problem (i.e., partial disclosure is possible in the case of a null query set or a query set of size 1). We recommend directing future research efforts toward developing new methods that prevent exact disclosure and provide statistical-disclosure control, while not suffering from the bias problem or the 0/1 query-set-size problem. Furthermore, efforts directed toward developing a bias-correction mechanism and solving the general problem of small query-set size would help salvage a few of the current perturbation-based methods.
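The 0/1 query-set-size problem mentioned above can be illustrated with a minimal sketch. It assumes a simple form of output perturbation in which zero-mean Gaussian noise is added to the true SUM; this is an illustrative stand-in, not any specific method evaluated in the paper. The records, the `perturbed_sum` function, and the noise level are all hypothetical.

```python
import random

# Hypothetical toy data; names and values are illustrative only.
RECORDS = [
    {"name": "A", "dept": "Sales", "salary": 40000},
    {"name": "B", "dept": "Sales", "salary": 45000},
    {"name": "C", "dept": "R&D", "salary": 52000},
]


def perturbed_sum(records, predicate, field, noise_sd, rng):
    """Compute SUM over the query set on the true data, then add
    zero-mean Gaussian noise to the answer (assumed perturbation)."""
    query_set = [r for r in records if predicate(r)]
    true_sum = sum(r[field] for r in query_set)
    return true_sum + rng.gauss(0.0, noise_sd)


rng = random.Random(0)  # seeded only to make the sketch repeatable

# Only one record matches, so the query set has size 1. The noisy
# answer is an estimate of that single individual's salary.
answer = perturbed_sum(
    RECORDS, lambda r: r["dept"] == "R&D", "salary", 1000.0, rng
)
print(round(answer))
```

Because the noise is small relative to the value, the response narrows the individual's salary to a tight range even though the exact figure is never released. That is a partial disclosure in the sense used above.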
