Adaptive Data Analysis

These lecture notes are based on [BNS+16] (reference [6] below) and were compiled for a guest lecture in the course CS229r “Information Theory in Computer Science”, taught by Madhu Sudan at Harvard University in Spring 2016.

Menu for today’s lecture:
• Motivation
• Model
• Overfitting & comparison to non-adaptive data analysis
• What can we do adaptively?
• KL divergence recap
• Proof
• Differential privacy (time permitting)
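The overfitting item above has a compact demonstration: if an analyst asks for the empirical means of many independent attributes and then adaptively combines the signs of those answers into one further query, the sample's answer to that query drifts far from the truth, because the query was built to aggregate the sampling noise. A minimal simulation sketch of this effect (the sizes n, d and the specific sign-combination query are illustrative assumptions, not taken from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 1000
# Each attribute is an unbiased +/-1 coin, so every true mean is exactly 0.
X = rng.choice([-1.0, 1.0], size=(n, d))

# Non-adaptive phase: ask for the empirical mean of each attribute.
# Each answer is roughly N(0, 1/n), i.e. individually close to the truth.
answers = X.mean(axis=0)

# Adaptive phase: craft one more query from the answers,
# q(x) = (1/sqrt(d)) * sum_j sign(answers_j) * x_j,
# which adds up the noise in the first d answers constructively.
signs = np.sign(answers)
empirical = (X @ signs).mean() / np.sqrt(d)  # what the sample reports
true_value = 0.0                             # the population truth

# The empirical value concentrates near sqrt(2d / (pi * n)), far from 0.
print(f"empirical mean of adaptive query: {empirical:.2f}")
print(f"true mean of adaptive query:      {true_value:.2f}")
```

With n = d = 1000 the sample reports a mean of roughly 0.8 for a query whose true mean is 0, even though every individual attribute mean was estimated accurately; a non-adaptive analyst who fixed all d + 1 queries in advance would see no such bias.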

[1] Cynthia Dwork and Aaron Roth. The Algorithmic Foundations of Differential Privacy. Found. Trends Theor. Comput. Sci., 2014.

[2] Cynthia Dwork, Vitaly Feldman, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Aaron Roth. Preserving Statistical Validity in Adaptive Data Analysis. In STOC, 2015.

[3] Thomas Steinke and Jonathan Ullman. Interactive Fingerprinting Codes and the Hardness of Preventing False Discovery. In Information Theory and Applications Workshop (ITA), 2016.

[4] Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. Our Data, Ourselves: Privacy Via Distributed Noise Generation. In EUROCRYPT, 2006.

[5] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating Noise to Sensitivity in Private Data Analysis. In TCC, 2006.

[6] Raef Bassily, Kobbi Nissim, Adam Smith, Thomas Steinke, Uri Stemmer, and Jonathan Ullman. Algorithmic Stability for Adaptive Data Analysis. In STOC, 2016.

[7] Cynthia Dwork. Differential Privacy. In ICALP, 2006.

[8] Cynthia Dwork, Vitaly Feldman, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Aaron Roth. The Reusable Holdout: Preserving Validity in Adaptive Data Analysis. Science, 2015.

[9] Moritz Hardt and Guy N. Rothblum. A Multiplicative Weights Mechanism for Privacy-Preserving Data Analysis. In FOCS, 2010.

[10] Moritz Hardt and Jonathan Ullman. Preventing False Discovery in Interactive Data Analysis Is Hard. In FOCS, 2014.

[11] Thomas Steinke and Jonathan Ullman. Between Pure and Approximate Differential Privacy. J. Priv. Confidentiality, 2015.