Students and Taxes: a Privacy-Preserving Study Using Secure Computation

Abstract: We describe the use of secure multi-party computation to perform a large-scale, privacy-preserving statistical study on real government data. In 2015, statisticians from the Estonian Center of Applied Research (CentAR) conducted a big data study looking for correlations between working during university studies and failing to graduate on time. The study linked the database of individual tax payments from the Estonian Tax and Customs Board with the database of higher education events from the Ministry of Education and Research. Data collection, preparation and analysis were carried out using the Sharemind secure multi-party computation system, which provided end-to-end cryptographic protection for the analysis. With ten million tax records and half a million education records in the analysis, this is the largest cryptographically private statistical study ever conducted on real data.
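
To give a rough sense of what cryptographic protection of individual records can mean in a secure multi-party computation, the following is a minimal sketch of additive secret sharing among three computing parties. It is an illustration only and not the protocol used in the study: the ring size `MODULUS`, the function names `share`, `reconstruct` and `secure_sum`, and the sample values are all assumptions made for this example.

```python
import secrets

# Assumed ring size for this illustration; all arithmetic is done modulo MODULUS.
MODULUS = 2**32


def share(value, parties=3):
    """Split value into additive shares that sum to value modulo MODULUS."""
    # The first parties - 1 shares are uniformly random.
    shares = [secrets.randbelow(MODULUS) for _ in range(parties - 1)]
    # The last share is chosen so that all shares add up to the value.
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares):
    """Recombine all shares into the original value."""
    return sum(shares) % MODULUS


def secure_sum(values, parties=3):
    """Sum the inputs while each party only ever sees its own shares."""
    held = [[] for _ in range(parties)]
    for value in values:
        # Each computing party receives exactly one share of every input.
        for holder, piece in zip(held, share(value, parties)):
            holder.append(piece)
    # Each party adds up its own shares locally, without communication.
    partial_sums = [sum(pieces) % MODULUS for pieces in held]
    # Only the aggregate result is reconstructed, never an individual input.
    return reconstruct(partial_sums)


if __name__ == "__main__":
    # Hypothetical per-person amounts, invented for this example.
    print(secure_sum([1200, 950, 0, 3100]))
```

In this sketch each party's shares are uniformly random on their own, so a single party learns nothing about any individual input, yet the parties can jointly compute an aggregate. Operations such as linking two databases require more involved protocols than the local addition shown here.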
