Shaojin Ding | Ke Hu | Ye Jia | Quan Wang
[1] Navdeep Jaitly,et al. Natural TTS Synthesis by Conditioning Wavenet on MEL Spectrogram Predictions , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Erich Elsen,et al. Efficient Neural Audio Synthesis , 2018, ICML.
[3] Liang Chen,et al. Deep Neural Network Based Regression Approach for Acoustic Echo Cancellation , 2019, ICMSSP 2019.
[4] Gerhard Schmidt,et al. Acoustic echo and noise control , 2004 .
[5] Tomohiro Nakatani,et al. Single Channel Target Speaker Extraction and Recognition with Speaker Beam , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Jungwon Lee,et al. Deep Multitask Acoustic Echo Cancellation , 2019, INTERSPEECH.
[7] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[8] Jun Wang,et al. Deep Extractor Network for Target Speaker Recovery From Single Channel Speech Mixtures , 2018, INTERSPEECH.
[9] Andries P. Hekstra,et al. Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[10] Gerald Enzner,et al. Acoustic Echo Control , 2014 .
[11] Lior Wolf,et al. Attention-based Wavenet Autoencoder for Universal Voice Conversion , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] John R. Hershey,et al. VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking , 2018, INTERSPEECH.
[13] Lei He,et al. Robust Sequence-to-Sequence Acoustic Modeling with Stepwise Monotonic Attention for Neural TTS , 2019, INTERSPEECH.
[14] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[15] Kuldip K. Paliwal,et al. Bidirectional recurrent neural networks , 1997, IEEE Trans. Signal Process.
[16] Quoc V. Le,et al. Improved Noisy Student Training for Automatic Speech Recognition , 2020, INTERSPEECH.
[17] Junichi Yamagishi,et al. SUPERSEDED - CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit , 2016 .
[18] DeLiang Wang,et al. Deep Learning for Acoustic Echo Cancellation in Noisy and Double-Talk Scenarios , 2018, INTERSPEECH.
[19] Zhuo Chen,et al. Deep clustering: Discriminative embeddings for segmentation and separation , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Heiga Zen,et al. LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech , 2019, INTERSPEECH.
[21] Philip Chan,et al. Toward accurate dynamic time warping in linear time and space , 2007, Intell. Data Anal.
[22] Alex Graves,et al. Generating Sequences With Recurrent Neural Networks , 2013, ArXiv.
[23] Melvin Johnson,et al. Direct speech-to-speech translation with a sequence-to-sequence model , 2019, INTERSPEECH.
[24] E. A. Martin,et al. Multi-style training for robust isolated-word speech recognition , 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[25] Jindrich Libovický,et al. Attention Strategies for Multi-Source Sequence-to-Sequence Learning , 2017, ACL.
[26] Tara N. Sainath,et al. Generation of Large-Scale Simulated Utterances in Virtual Rooms to Train Deep-Neural Networks for Far-Field Speech Recognition in Google Home , 2017, INTERSPEECH.
[27] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] E. Hänsler,et al. Acoustic Echo and Noise Control: A Practical Approach , 2004 .
[29] Yuan Yu,et al. TensorFlow: A system for large-scale machine learning , 2016, OSDI.
[30] Tara N. Sainath,et al. Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling , 2019, ArXiv.
[31] Jungwon Lee,et al. CAD-AEC: Context-Aware Deep Acoustic Echo Cancellation , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[32] Paris Smaragdis,et al. Deep learning for monaural speech separation , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[33] Jacob Benesty,et al. Advances in Network and Acoustic Echo Cancellation , 2001 .
[34] Tara N. Sainath,et al. Deliberation Model Based Two-Pass End-To-End Speech Recognition , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[35] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[36] R. Kubichek,et al. Mel-cepstral distance measure for objective speech quality assessment , 1993, Proceedings of IEEE Pacific Rim Conference on Communications Computers and Signal Processing.
[37] Jun Du,et al. Speech separation of a target speaker based on deep neural networks , 2014, 2014 12th International Conference on Signal Processing (ICSP).
[38] Fadi Biadsy,et al. Parrotron: An End-to-End Speech-to-Speech Conversion Model and its Applications to Hearing-Impaired Speech and Speech Separation , 2019, INTERSPEECH.
[39] Kenneth Heafield,et al. Multi-Source Syntactic Neural Machine Translation , 2018, EMNLP.
[40] Yuxuan Wang,et al. Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron , 2018, ICML.
[41] Dit-Yan Yeung,et al. Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting , 2015, NIPS.
[42] Sanjeev Khudanpur,et al. A study on data augmentation of reverberant speech for robust speech recognition , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[43] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[44] Nima Mesgarani,et al. TaSNet: Time-Domain Audio Separation Network for Real-Time, Single-Channel Speech Separation , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[45] DeLiang Wang,et al. An Unsupervised Approach to Cochannel Speech Separation , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[46] Patrick Nguyen,et al. Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis , 2018, NeurIPS.
[47] Nenghai Yu,et al. Deliberation Networks: Sequence Generation Beyond One-Pass Decoding , 2017, NIPS.