Cardinality-Based Inference Control in Sum-Only Data Cubes

This paper addresses inference problems in data warehouses and decision support systems such as on-line analytical processing (OLAP) systems. Even though OLAP systems restrict user access to predefined aggregations, inappropriate disclosure of sensitive attribute values may still occur. We define a set of variables to be non-compromiseable under a given set of aggregations if every variable can take more than one value while still satisfying those aggregations, and we derive sufficient conditions for non-compromiseability in sum-only data cubes. Under this definition, (1) the non-compromiseability of multi-dimensional aggregations can be reduced to that of one-dimensional aggregations, (2) full or dense core cuboids are non-compromiseable, and (3) there is a tight lower bound on the cardinality of a core cuboid for it to remain non-compromiseable. Based on these results, together with a three-tier model for controlling inferences, we provide a divide-and-conquer algorithm that uniformly divides data sets into chunks and builds a data cube on each chunk. The union of these data cubes is then used to provide users with inference-free OLAP queries.
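The definition of non-compromiseability above can be illustrated with a small linear-algebra check. The sketch below is not the paper's algorithm; it is a minimal illustration that assumes a two-dimensional core cuboid, unbounded real-valued cells, and only the one-dimensional SUM aggregations (row sums and column sums). Under those assumptions, a cell is compromised exactly when its unit vector lies in the row space of the aggregation matrix, which shows up as a single-entry row in the reduced row echelon form. The function names `rref` and `compromised_cells` are hypothetical.

```python
from fractions import Fraction


def rref(rows):
    """Return the nonzero rows of the reduced row echelon form (exact arithmetic)."""
    m = [[Fraction(v) for v in r] for r in rows]
    n_cols = len(m[0]) if m else 0
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == len(m):
            break
        # Find a row at or below pivot_row with a nonzero entry in this column.
        found = next((r for r in range(pivot_row, len(m)) if m[r][col] != 0), None)
        if found is None:
            continue
        m[pivot_row], m[found] = m[found], m[pivot_row]
        pivot = m[pivot_row][col]
        m[pivot_row] = [v / pivot for v in m[pivot_row]]
        # Clear this column in every other row.
        for r in range(len(m)):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return m[:pivot_row]


def compromised_cells(cells, n_rows, n_cols):
    """Cells whose values are uniquely determined by the row and column sums.

    `cells` holds the (row, col) positions present in the core cuboid;
    missing positions are treated as empty (assumption of this sketch).
    """
    cells = sorted(cells)
    aggregations = []
    for r in range(n_rows):  # one SUM per row
        aggregations.append([1 if c[0] == r else 0 for c in cells])
    for k in range(n_cols):  # one SUM per column
        aggregations.append([1 if c[1] == k else 0 for c in cells])
    exposed = set()
    for row in rref(aggregations):
        nonzero = [i for i, v in enumerate(row) if v != 0]
        if len(nonzero) == 1:  # the row is a unit vector: that cell is pinned down
            exposed.add(cells[nonzero[0]])
    return exposed


full = {(0, 0), (0, 1), (1, 0), (1, 1)}
sparse = {(0, 0), (1, 0), (1, 1)}
print(compromised_cells(full, 2, 2))
print(sorted(compromised_cells(sparse, 2, 2)))
```

In this toy setting the full 2×2 cuboid exposes no cell, consistent with result (2), while removing one cell leaves a row sum over a single cell, and that disclosure propagates through the column sums to the remaining cells.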
