LinkedIn's Audience Engagements API: A Privacy Preserving Data Analytics System at Scale

We present a privacy system that uses differential privacy to protect LinkedIn members' data while still providing audience engagement insights for marketing analytics applications. We detail the differentially private algorithms and other privacy safeguards used to produce results that are compatible with existing real-time data analytics platforms, specifically the open-source Pinot system. Our privacy system provides user-level privacy guarantees. It includes a budget management service that enforces a strict differential privacy budget on the results returned to the analyst. This service applies recent research in differential privacy in a production setting to maintain utility under a fixed privacy budget.
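The idea of a budget management service can be illustrated with a minimal sketch. The code below is not the system described in the paper: it assumes basic (additive) composition of epsilon values and a plain Laplace mechanism for counts, whereas the abstract refers to more recent algorithms and composition results. The names `BudgetManager`, `noisy_count`, and the `sensitivity` parameter (standing in for a bound on how much one member can change a count) are hypothetical and chosen only for illustration.

```python
import random


class BudgetExceeded(Exception):
    """Raised when a query would push an analyst past the privacy budget."""


class BudgetManager:
    """Tracks epsilon spent per analyst under basic (additive) composition.

    Assumption: a single fixed budget applies to every analyst.
    """

    def __init__(self, total_epsilon):
        self.total_epsilon = total_epsilon
        self.spent = {}  # analyst id -> epsilon consumed so far

    def charge(self, analyst, epsilon):
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        used = self.spent.get(analyst, 0.0)
        if used + epsilon > self.total_epsilon:
            raise BudgetExceeded(f"{analyst} has {self.remaining(analyst)} left")
        self.spent[analyst] = used + epsilon

    def remaining(self, analyst):
        return self.total_epsilon - self.spent.get(analyst, 0.0)


def laplace_noise(scale, rng=random):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, sensitivity, budget, analyst, rng=random):
    """Charge the budget first, then return a Laplace-noised count."""
    budget.charge(analyst, epsilon)  # refuse the query before touching data
    return true_count + laplace_noise(sensitivity / epsilon, rng)
```

In this sketch the budget is charged before any noisy result is produced, so a query that would exceed the budget returns nothing at all. A production service would likely use tighter composition bounds than simple addition, which is what lets it answer more queries at the same total budget.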
