Robust Noisy Speech Parameterization Using Convolutional Neural Networks

Elias Azarov | Ryhor Vashkevich

[1] Ji Wu, et al. Denoising deep neural networks based voice activity detection, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[2] Ryan Prenger, et al. Waveglow: A Flow-based Generative Network for Speech Synthesis, 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[3] Heiga Zen, et al. WaveNet: A Generative Model for Raw Audio, 2016, SSW.

[4] Jing Pang. Spectrum energy based voice activity detection, 2017, 2017 IEEE 7th Annual Computing and Communication Workshop and Conference (CCWC).

[5] Gabriel Synnaeve, et al. Wav2Letter: an End-to-End ConvNet-based Speech Recognition System, 2016, ArXiv.

[6] Chong Wang, et al. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin, 2015, ICML.

[7] Thad Hughes, et al. Recurrent neural networks for voice activity detection, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[8] Tara N. Sainath, et al. Feature Learning with Raw-Waveform CLDNNs for Voice Activity Detection, 2016, INTERSPEECH.

[9] Mark Liberman, et al. Speech activity detection on YouTube using deep neural networks, 2013, INTERSPEECH.

[10] Ted Painter, et al. Audio Signal Processing and Coding, 2007.

[11] Cassia Valentini-Botinhao, et al. Noisy speech database for training speech enhancement algorithms and TTS models, 2017.

[12] Jun Du, et al. A universal VAD based on jointly trained deep neural networks, 2015, INTERSPEECH.

[13] Björn W. Schuller, et al. Real-life voice activity detection with LSTM Recurrent Neural Networks and an application to Hollywood movies, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[14] Hyeontaek Lim, et al. Formant-Based Robust Voice Activity Detection, 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.