Learning Realistic Patterns from Visually Unrealistic Stimuli: Generalization and Data Anonymization

Good training data is a prerequisite for developing useful machine learning applications. However, in many domains existing data sets cannot be shared due to privacy regulations (e.g., data from medical studies). This work investigates a simple yet unconventional approach to anonymized data synthesis that enables third parties to benefit from such data. We explore the feasibility of learning implicitly from visually unrealistic, task-relevant stimuli, which are synthesized by exciting the neurons of a trained deep neural network. These stimuli are then used to train new classification models. Furthermore, we extend this framework to inhibit representations that are associated with specific individuals. We use sleep monitoring data from both an open and a large closed clinical study, as well as electroencephalogram (EEG) sleep stage classification data, to evaluate whether (1) end-users can create and successfully use customized classification models, and (2) the identity of participants in the study is protected. An extensive comparative empirical investigation shows that different algorithms trained on the stimuli generalize successfully on the same task as the original model. Architectural and algorithmic similarity between the new and original models plays an important role in performance. For similar architectures, the performance is close to that obtained with the original data (e.g., accuracy difference of 0.56%-3.82%, Kappa coefficient difference of 0.02-0.08). Further experiments show that the stimuli can provide state-of-the-art resilience against adversarial association and membership inference attacks.
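
The idea of synthesizing stimuli by exciting the neurons of a trained model can be illustrated with a minimal sketch. It is not the implementation used in this work: a tiny linear softmax classifier with made-up weights stands in for the trained deep neural network, and the names `synthesize_stimulus`, `W`, `b`, `lr`, and `decay` are hypothetical. Starting from small random noise, gradient ascent on the log-probability of a target class yields an input that strongly excites that output neuron; the resulting (stimulus, label) pairs could then serve as training data for a new model.

```python
import math
import random


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def predict(W, b, x):
    """Class probabilities of a linear softmax model (stand-in for a trained network)."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]
    return softmax(logits)


def synthesize_stimulus(W, b, target, steps=200, lr=0.1, decay=0.01, seed=0):
    """Gradient ascent on log p(target | x) with a small L2 penalty on x."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 0.1) for _ in range(len(W[0]))]  # start from small noise
    for _ in range(steps):
        p = predict(W, b, x)
        for j in range(len(x)):
            # d log p_target / d x_j = W[target][j] - sum_k p_k * W[k][j]
            grad = W[target][j] - sum(p[k] * W[k][j] for k in range(len(W)))
            x[j] += lr * (grad - decay * x[j])
    return x


# Hypothetical "trained" weights: 3 classes, 3 input features (assumed for illustration).
W = [[1.0, -1.0, 0.5], [-0.5, 1.0, 1.0], [0.0, 0.5, -1.0]]
b = [0.0, 0.0, 0.0]

# One synthetic stimulus per class, labeled with the class whose neuron it excites.
stimuli = [(synthesize_stimulus(W, b, c, seed=c), c) for c in range(3)]
for x, c in stimuli:
    print(c, [round(v, 2) for v in x], round(predict(W, b, x)[c], 3))
```

The stimuli produced this way need not resemble any real recording, which is the sense in which they are visually unrealistic yet task-relevant. In a deep network the same ascent would be carried out with automatic differentiation rather than the closed-form gradient used here.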
