Privacy-Preserving Randomized Controlled Trials: A Protocol for Industry Scale Deployment

Randomized Controlled Trials, when feasible, give the strongest and most trustworthy empirical measures of causal effects. They are the gold standard in many clinical, social, and behavioral fields of study. However, the most important settings often involve the most sensitive data, therefore cause privacy concerns. In this paper, we outline a way to deploy an end-to-end privacy-preserving protocol for learning causal effects from Randomized Controlled Trials (RCTs). We are particularly focused on the difficult and important case where one party determines which treatment an individual receives, and another party measures outcomes on individuals, and these parties do not want to leak any of their information to each other, but still want to collectively learn a true causal effect in the world. Moreover, we show how such a protocol can be scaled to 500 million rows of data and more than a billion gates. We also offer an open source deployment of this protocol. We accomplish this by a three-stage solution, interconnecting and blending three privacy technologies--private set intersection, multiparty computation, and differential privacy--to address core points of privacy leakage, at the join, at the point of computation, and at the release, respectively. The first stage uses the Private-ID protocol[8] to create a private encrypted join of the users. The second stage utilizes the encrypted join to run multiple instances of a general purpose MPC over a sharded database to aggregate statistics about each experimental group while discarding individuals who took an action before they received treatment. The third stage adds distributed and calibrated Differential Privacy (DP) noise within the final MPC computations to the released aggregate statistical estimates of causal effects and their uncertainty measures, providing formal two-sided privacy guarantees. We also evaluate the performance of multiple open source general purpose MPC libraries for this task. We additionally demonstrate how we have used this to create a working ads effectiveness measurement product capable of measuring hundreds of millions of individuals per experiment.

[1]  Abhradeep Thakurta,et al.  Statistically Valid Inferences from Privacy-Protected Data , 2023, American Political Science Review.

[2]  C. Dwork,et al.  Exposed! A Survey of Attacks on Private Data , 2017, Annual Review of Statistics and Its Application.

[3]  David Evans,et al.  Obliv-C: A Language for Extensible Data-Oblivious Computation , 2015, IACR Cryptol. ePrint Arch..

[4]  C. Mulrow,et al.  Current methods of the US Preventive Services Task Force: a review of the process. , 2001, American journal of preventive medicine.

[5]  Andrew Bray,et al.  Differentially Private Confidence Intervals , 2020, ArXiv.

[6]  Claudio Orlandi,et al.  A New Approach to Practical Active-Secure Two-Party Computation , 2012, IACR Cryptol. ePrint Arch..

[7]  Tal Malkin,et al.  Garbling Gadgets for Boolean and Arithmetic Circuits , 2016, IACR Cryptol. ePrint Arch..

[8]  Yihua Zhang,et al.  PICCO: a general-purpose compiler for private distributed computation , 2013, CCS.

[9]  James Honaker,et al.  Bootstrap Inference and Differential Privacy: Standard Errors for Free∗ , 2018 .

[10]  Benny Pinkas,et al.  Efficient Circuit-based PSI with Linear Communication , 2019, IACR Cryptol. ePrint Arch..

[11]  Toniann Pitassi,et al.  The Limits of Two-Party Differential Privacy , 2010, 2010 IEEE 51st Annual Symposium on Foundations of Computer Science.

[12]  Vito D'Orazio,et al.  Differential Privacy for Social Science Inference , 2015 .

[13]  Patrick Traynor,et al.  Frigate: A Validated, Extensible, and Efficient Compiler and Interpreter for Secure Computation , 2016, 2016 IEEE European Symposium on Security and Privacy (EuroS&P).

[14]  Ivan Damgård,et al.  Multiparty Computation from Somewhat Homomorphic Encryption , 2012, IACR Cryptol. ePrint Arch..

[15]  Muhammad Khurram Khan,et al.  Recent advancements in garbled computing: How far have we come towards achieving secure, efficient and reusable garbled circuits , 2018, J. Netw. Comput. Appl..

[16]  James Honaker,et al.  Unbiased Statistical Estimation and Valid Confidence Intervals Under Differential Privacy , 2021, Statistica Sinica.

[17]  Benny Pinkas,et al.  Efficient Circuit-based PSI via Cuckoo Hashing , 2018, IACR Cryptol. ePrint Arch..

[18]  Brett Hemenway,et al.  SoK: General Purpose Compilers for Secure Multi-Party Computation , 2019, 2019 IEEE Symposium on Security and Privacy (SP).

[19]  Benny Pinkas,et al.  Phasing: Private Set Intersection Using Permutation-based Hashing , 2015, USENIX Security Symposium.

[20]  Cynthia Dwork,et al.  Calibrating Noise to Sensitivity in Private Data Analysis , 2006, TCC.

[21]  Alex J. Malozemoff,et al.  Faster Secure Two-Party Computation in the Single-Execution Setting , 2017, EUROCRYPT.

[22]  Jonathan Katz,et al.  Private Set Intersection: Are Garbled Circuits Better than Custom Protocols? , 2012, NDSS.

[23]  Payman Mohassel,et al.  Private Matching for Compute , 2020, IACR Cryptol. ePrint Arch..

[24]  Aleksandra B. Slavkovic,et al.  Differentially Private Inference for Binomial Data , 2019, J. Priv. Confidentiality.

[25]  Vishesh Karwa,et al.  Finite Sample Differentially Private Confidence Intervals , 2017, ITCS.

[26]  Helmut Veith,et al.  Secure two-party computations in ANSI C , 2012, CCS.

[27]  Aseem Rastogi,et al.  EzPC: Programmable, Efficient, and Scalable Secure Two-Party Computation , 2018, IACR Cryptol. ePrint Arch..

[28]  Jonathan Katz,et al.  Authenticated Garbling and Efficient Maliciously Secure Two-Party Computation , 2017, CCS.

[29]  Aaron Roth,et al.  The Algorithmic Foundations of Differential Privacy , 2014, Found. Trends Theor. Comput. Sci..

[30]  Thomas Steinke,et al.  Concentrated Differential Privacy: Simplifications, Extensions, and Lower Bounds , 2016, TCC.

[31]  Ivan Damgård,et al.  Semi-Homomorphic Encryption and Multiparty Computation , 2011, IACR Cryptol. ePrint Arch..

[32]  Amos Beimel,et al.  Secret-Sharing Schemes: A Survey , 2011, IWCC.

[33]  Michael Zohner,et al.  ABY - A Framework for Efficient Mixed-Protocol Secure Two-Party Computation , 2015, NDSS.

[34]  Moti Yung,et al.  On Deploying Secure Computing: Private Intersection-Sum-with-Cardinality , 2020, 2020 IEEE European Symposium on Security and Privacy (EuroS&P).

[35]  Emiliano De Cristofaro,et al.  Fast and Private Computation of Cardinality of Set Intersection and Union , 2012, CANS.

[36]  Peter Scholl,et al.  Low Cost Constant Round MPC Combining BMR and Oblivious Transfer , 2017, Journal of Cryptology.

[37]  Daniel Sheldon,et al.  General-Purpose Differentially-Private Confidence Intervals , 2020, ArXiv.