Source-Aware Neural Speech Coding for Noisy Speech Compression
暂无分享,去创建一个
Minje Kim | Haici Yang | Kai Zhen | Seungkwon Beack | Minje Kim | Seungkwon Beack | Haici Yang | Kai Zhen
[1] Minje Kim,et al. Cascaded Cross-Module Residual Learning towards Lightweight End-to-End Speech Coding , 2019, INTERSPEECH.
[2] Nima Mesgarani,et al. TaSNet: Time-Domain Audio Separation Network for Real-Time, Single-Channel Speech Separation , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Nima Mesgarani,et al. Conv-TasNet: Surpassing Ideal Time–Frequency Magnitude Masking for Speech Separation , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[4] Andreas Niedermeier,et al. Intelligent Gap Filling in Perceptual Transform Coding of Audio , 2016 .
[5] Srihari Kankanahalli,et al. End-To-End Optimized Speech Coding with Deep Neural Networks , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Carla Teixeira Lopes,et al. TIMIT Acoustic-Phonetic Continuous Speech Corpus , 2012 .
[7] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[8] Lucas Theis,et al. Lossy Image Compression with Compressive Autoencoders , 2017, ICLR.
[9] Jan Plogsties,et al. MPEG-H Audio—The New Standard for Universal Spatial / 3D Audio Coding , 2014 .
[10] Paris Smaragdis,et al. Online PLCA for Real-Time Semi-supervised Source Separation , 2012, LVA/ICA.
[11] Minje Kim,et al. Efficient and Scalable Neural Residual Waveform Coding with Collaborative Quantization , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Daniel Rueckert,et al. Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Luca Benini,et al. Soft-to-Hard Vector Quantization for End-to-End Learning Compressible Representations , 2017, NIPS.
[14] Jan Skoglund,et al. A Real-Time Wideband Neural Vocoder at 1.6 kb/s Using LPCNet , 2019, INTERSPEECH.
[15] Roch Lefebvre,et al. The adaptive multirate wideband speech codec (AMR-WB) , 2002, IEEE Trans. Speech Audio Process..
[16] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Quan Wang,et al. Wavenet Based Low Rate Speech Coding , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Jesper Jensen,et al. A short-time objective intelligibility measure for time-frequency weighted noisy speech , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[19] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[20] Guillaume Fuchs,et al. Frequency-domain Comfort Noise Generation for Discontinuous Transmission in EVS , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Timothy B. Terriberry,et al. Definition of the Opus Audio Codec , 2012, RFC.
[22] Manfred R. Schroeder,et al. Code-excited linear prediction(CELP): High-quality speech at very low bit rates , 1985, ICASSP '85. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[23] Michael Chinen,et al. Robust Low Rate Speech Coding Based on Cloned Networks and Wavenet , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Jonathan Le Roux,et al. SDR – Half-baked or Well Done? , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).