Apprentissage automatique de représentation de voix à l’aide d’une distillation de la connaissance pour le casting vocal (Learning voice representation using knowledge distillation for automatic voice casting)
Richard Dufour | Mathias Quillot | Adrien Gresse | Jean-François Bonastre
[1] Sanjeev Khudanpur, et al. Deep neural network-based speaker embeddings for end-to-end speaker verification, 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[2] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.
[3] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.
[4] Ryo Masumura, et al. Domain adaptation of DNN acoustic models using knowledge distillation, 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Jonathan Le Roux, et al. Student-teacher network learning with enhanced features, 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Axel Röbel, et al. Similarity Search of Acted Voices for Automatic Voice Casting, 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[7] Rauf Izmailov, et al. Learning using privileged information: similarity control and knowledge transfer, 2015, J. Mach. Learn. Res.
[8] Patrick Kenny, et al. Front-End Factor Analysis for Speaker Verification, 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[9] Joon Son Chung, et al. VoxCeleb2: Deep Speaker Recognition, 2018, INTERSPEECH.
[10] Neethu Mariam Joy, et al. Generalized Distillation Framework for Speaker Normalization, 2017, INTERSPEECH.
[11] Tomoko Matsui, et al. Robust Speech Recognition Using Generalized Distillation Framework, 2016, INTERSPEECH.
[12] Koichi Shinoda, et al. Wise teachers train better DNN acoustic models, 2016, EURASIP J. Audio Speech Music. Process.
[13] Bernhard Schölkopf, et al. Unifying distillation and privileged information, 2015, ICLR.
[14] Sanjeev Khudanpur, et al. Deep Neural Network Embeddings for Text-Independent Speaker Verification, 2017, INTERSPEECH.
[15] Mickael Rouvier, et al. Acoustic Pairing of Original and Dubbed Voices in the Context of Video Game Localization, 2017, INTERSPEECH.
[16] Daniel Povey, et al. The Kaldi Speech Recognition Toolkit, 2011.
[17] Sanjeev Khudanpur, et al. X-Vectors: Robust DNN Embeddings for Speaker Recognition, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Axel Röbel, et al. On automatic voice casting for expressive speech: Speaker recognition vs. speech classification, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Yifan Gong, et al. Large-Scale Domain Adaptation via Teacher-Student Learning, 2017, INTERSPEECH.
[20] Erik McDermott, et al. Deep neural networks for small footprint text-dependent speaker verification, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Pascal Vincent, et al. Representation Learning: A Review and New Perspectives, 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[22] Jean-François Bonastre, et al. Similarity Metric Based on Siamese Neural Networks for Voice Casting, 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).