Using Deep Learning to Generate Relational HoneyData

Although there has been a plethora of work on generating deceptive applications, generating deceptive data that can convincingly fool attackers has received very little attention. In this book chapter, we discuss our secure deceptive data generation framework, which makes it hard for an attacker to distinguish real data from deceptive data. In particular, we discuss how to generate such deceptive data using deep learning and differential privacy techniques. In addition, we discuss our formal evaluation framework.
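To give a flavor of how differential privacy can drive honeydata generation, the sketch below samples synthetic records from a Laplace-noised histogram of the real data. This is an illustrative toy, not the chapter's actual deep-learning-based method; the function name, signature, and the histogram-sampling approach are assumptions for exposition.

```python
import random
from collections import Counter

def dp_synthetic_data(records, epsilon, n_synthetic, seed=0):
    """Illustrative sketch: generate honeydata by sampling from a
    differentially private histogram of the real records.

    records: list of hashable values; epsilon: privacy budget.
    """
    rng = random.Random(seed)
    counts = Counter(records)
    # Laplace mechanism: a count has sensitivity 1 under add/remove-one
    # neighboring datasets, so the noise scale is 1/epsilon. A Laplace
    # sample is the difference of two Exp(epsilon) samples.
    noisy = {
        v: max(0.0, c + rng.expovariate(epsilon) - rng.expovariate(epsilon))
        for v, c in counts.items()
    }
    total = sum(noisy.values())
    if total == 0:
        return []
    values = list(noisy)
    weights = [noisy[v] / total for v in values]
    # Sample synthetic (deceptive) records from the noisy distribution,
    # so the output mimics the real data without exposing exact counts.
    return rng.choices(values, weights=weights, k=n_synthetic)
```

A smaller epsilon injects more noise, making the released histogram (and hence the honeydata) less faithful to the real distribution; the chapter's framework instead trains deep generative models under a differential privacy guarantee to handle high-dimensional relational data.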
