Differentially Private Online Active Learning with Applications to Anomaly Detection

When data instances arrive sequentially or as a stream, online learning algorithms can train predictors incrementally using methods such as stochastic gradient descent. In some security applications, such as training anomaly detectors, the data streams may contain private information or transactions, and the output of the learning algorithm may reveal information about the training data. Differential privacy is a framework for quantifying the privacy risk in such settings. This paper proposes two differentially private strategies to mitigate privacy risk when training a classifier for anomaly detection in an online setting. The first uses a randomized active learning heuristic to screen out uninformative data points in the stream. The second uses mini-batching to improve classifier performance. Experimental results show how these two strategies trade off privacy, label complexity, and generalization performance.
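To make the two strategies concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a logistic loss with labels in {-1, +1}, per-example gradient clipping to bound sensitivity, and noise with density proportional to exp(-||z|| / scale) added to each mini-batch gradient sum. The function names, the margin-based query rule, and all parameter values (`floor`, `sharpness`, `clip_norm`, `lr`) are hypothetical choices made for illustration. The sketch does not account for any privacy cost of the label-query decisions themselves and should not be read as a verified privacy guarantee.

```python
import math
import random


def clip(vec, max_norm):
    """Scale vec so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm <= max_norm or norm == 0.0:
        return list(vec)
    return [v * max_norm / norm for v in vec]


def logistic_grad(w, x, y):
    """Gradient of log(1 + exp(-y * <w, x>)) in w, for y in {-1, +1}."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    coeff = -y / (1.0 + math.exp(min(margin, 50.0)))  # cap avoids overflow
    return [coeff * xi for xi in x]


def query_probability(w, x, floor=0.1, sharpness=1.0):
    """Hypothetical randomized heuristic: points near the current decision
    boundary are queried with probability close to 1, distant points with
    probability no lower than `floor`."""
    score = abs(sum(wi * xi for wi, xi in zip(w, x)))
    return max(floor, math.exp(-sharpness * score))


def sample_noise(dim, scale, rng):
    """Draw z with density proportional to exp(-||z||_2 / scale):
    uniform direction, Gamma(dim, scale) radius."""
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in direction))
    radius = rng.gammavariate(dim, scale)
    return [radius * v / norm for v in direction]


def dp_minibatch_step(w, batch, epsilon, lr, clip_norm, rng):
    """One noisy SGD step on a mini-batch of (x, y) pairs."""
    total = [0.0] * len(w)
    for x, y in batch:
        g = clip(logistic_grad(w, x, y), clip_norm)
        total = [t + gi for t, gi in zip(total, g)]
    # Replacing one example changes the clipped sum by at most 2 * clip_norm.
    noise = sample_noise(len(w), 2.0 * clip_norm / epsilon, rng)
    b = len(batch)
    return [wi - lr * (t + z) / b for wi, t, z in zip(w, total, noise)]


def train(stream, dim, epsilon, batch_size=5, lr=0.1, clip_norm=1.0, rng=None):
    """Screen the stream with the randomized heuristic, then update on
    mini-batches of the labeled points. Returns (weights, labels queried)."""
    if rng is None:
        rng = random.Random(0)
    w = [0.0] * dim
    batch, queried = [], 0
    for x, y in stream:
        if rng.random() > query_probability(w, x):
            continue  # label not requested for this point
        queried += 1
        batch.append((x, y))
        if len(batch) == batch_size:
            w = dp_minibatch_step(w, batch, epsilon, lr, clip_norm, rng)
            batch = []
    return w, queried


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic stream: the label is the sign of the first coordinate.
    points = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
    stream = [(x, 1 if x[0] > 0 else -1) for x in points]
    w, queried = train(stream, dim=2, epsilon=1.0, rng=rng)
    print("weights:", w, "labels queried:", queried)
```

In this sketch the noise is added to the gradient sum before dividing by the batch size, so its effect on each update shrinks as the batch grows. This is one way mini-batching can help accuracy at a fixed per-update privacy parameter, at the cost of fewer updates per labeled point.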
