Speaker information modification in the VoicePrivacy 2020 toolchain
[1] Stephen McAdams, et al. Spectral fusion, spectral parsing and the formation of auditory images, 1984.
[2] Daniel Povey, et al. The Kaldi Speech Recognition Toolkit, 2011.
[3] Jerzy Sas, et al. Gender recognition using neural networks and ASR techniques, 2013.
[4] Haizhou Li, et al. Exemplar-Based Sparse Representation With Residual Compensation for Voice Conversion, 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[5] Sanjeev Khudanpur, et al. Librispeech: An ASR corpus based on public domain audio books, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Sanjeev Khudanpur, et al. A time delay neural network architecture for efficient modeling of long temporal contexts, 2015, INTERSPEECH.
[7] Rico Sennrich, et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.
[8] Hao Wang, et al. Phonetic posteriorgrams for many-to-one voice conversion without parallel data training, 2016, 2016 IEEE International Conference on Multimedia and Expo (ICME).
[9] Daniel Erro, et al. Reversible speaker de-identification using pre-trained transformation functions, 2017, Comput. Speech Lang.
[10] John R. Hershey, et al. Hybrid CTC/Attention Architecture for End-to-End Speech Recognition, 2017, IEEE Journal of Selected Topics in Signal Processing.
[11] Seyed Hamidreza Mohammadi, et al. An overview of voice conversion systems, 2017, Speech Commun.
[12] Shinji Watanabe, et al. ESPnet: End-to-End Speech Processing Toolkit, 2018, INTERSPEECH.
[13] Sanjeev Khudanpur, et al. X-Vectors: Robust DNN Embeddings for Speaker Recognition, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Shinnosuke Takamichi, et al. Non-Parallel Voice Conversion Using Variational Autoencoders Conditioned by Phonetic Posteriorgrams and D-Vectors, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Tetsuji Ogawa, et al. Speaker Invariant Feature Extraction for Zero-Resource Languages with Adversarial Learning, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Yiming Wang, et al. Semi-Orthogonal Low-Rank Matrix Factorization for Deep Neural Networks, 2018, INTERSPEECH.
[17] Dan Qu, et al. Towards end-to-end speech recognition with transfer learning, 2018, EURASIP Journal on Audio, Speech, and Music Processing.
[18] Yoshua Bengio, et al. Learning Anonymized Representations with Adversarial Neural Networks, 2018, ArXiv.
[19] Albert Y. S. Lam, et al. Domain Adaptation of End-to-end Speech Recognition in Low-Resource Settings, 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[20] Taku Kudo, et al. Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates, 2018, ACL.
[21] Junichi Yamagishi, et al. Speaker Anonymization Using X-vector and Neural Waveform Models, 2019, 10th ISCA Workshop on Speech Synthesis (SSW 10).
[22] Xin Wang, et al. Neural Harmonic-plus-Noise Waveform Model with Trainable Maximum Voice Frequency for Text-to-Speech Synthesis, 2019, ArXiv.
[23] Francis Bach, et al. Partially Encrypted Deep Learning using Functional Encryption, 2019, NeurIPS.
[24] Nicolas Usunier, et al. To Reverse the Gradient or Not: an Empirical Comparison of Adversarial and Multi-task Learning in Speech Recognition, 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Marc Tommasi, et al. Privacy-Preserving Adversarial Representation Learning in ASR: Reality or Illusion?, 2019, INTERSPEECH.
[26] Isabel Trancoso, et al. The GDPR & Speech Data: Reflections of Legal and Technology Communities, First Steps towards a Common Understanding, 2019, INTERSPEECH.
[27] Junichi Yamagishi, et al. Introducing the VoicePrivacy Initiative, 2020, INTERSPEECH.