Speeding Up Secure Computations via Embedded Caching

Most existing work on Privacy-Preserving Data Mining (PPDM) focuses on enabling conventional data mining algorithms to run securely in a multi-party setting. Although various data mining algorithms have been enhanced with secure mechanisms for preserving data privacy, their computational cost is far too high for them to be practically useful. This is especially true for algorithms that rely on common cryptosystems. In this paper, we address the efficiency of PPDM algorithms by caching result data that are used more than once by secure computations. To make this possible, we carefully examine the micro steps of secure computations to identify their repetitive or iterative portions, and we reduce the overall computational cost by caching the intermediate results. We have applied this technique to decision tree induction, association rule mining, and k-means clustering, which make use of secure building blocks such as secure multi-party sum, secure matrix multiplication, and secure inverse of matrix sum. We show empirically that the computational cost of secure computations can be reduced without, in general, affecting the quality of the data mining result. Our experiments show that the caching technique generalizes to common data mining algorithms and that the efficiency of PPDM algorithms can be greatly improved without compromising data privacy.
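
To make the caching idea concrete, the following is a minimal toy sketch in Python and not the protocol used in the paper. It simulates a secure multi-party sum with additive secret sharing and stores each result under a query key, so a repeated request reuses the cached value instead of rerunning the protocol. The names `CachedSecureSum`, `share`, `MODULUS`, and `query_key` are hypothetical, all parties are simulated in one process, and the sketch assumes that the same key always refers to the same private inputs.

```python
import secrets

# Hypothetical modulus for additive secret sharing (an assumption of this sketch).
MODULUS = 2**61 - 1


def share(value, n_parties):
    """Split a non-negative value into n additive shares modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


class CachedSecureSum:
    """Toy secure multi-party sum that caches results by query key."""

    def __init__(self):
        self._cache = {}
        self.protocol_runs = 0  # counts how often the costly protocol executes

    def _run_protocol(self, private_values):
        n = len(private_values)
        # Each party splits its private value into one share per party.
        all_shares = [share(v, n) for v in private_values]
        # Party j adds up the j-th share it received from every party.
        partial_sums = [sum(s[j] for s in all_shares) % MODULUS for j in range(n)]
        self.protocol_runs += 1
        # Combining the partial sums reveals only the total.
        return sum(partial_sums) % MODULUS

    def secure_sum(self, query_key, private_values):
        # Run the protocol only on a cache miss; otherwise reuse the stored total.
        if query_key not in self._cache:
            self._cache[query_key] = self._run_protocol(private_values)
        return self._cache[query_key]
```

In an iterative algorithm such as decision tree induction, the same aggregate may be requested several times; under the assumptions above, only the first request pays the cost of the secure protocol.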