Privacy-preserving Voice Analysis via Disentangled Representations