Low Bit-rate Speech Coding with VQ-VAE and a WaveNet Decoder
Thomas C. Walters | Oriol Vinyals | Felicia S. C. Lim | Aäron van den Oord | Cristina Garbacea | Yazhe Li | Alejandro Luebs
[1] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[2] Luca Benini,et al. Soft-to-Hard Vector Quantization for End-to-End Learned Compression of Images and Neural Networks , 2017, ArXiv.
[3] Srihari Kankanahalli,et al. End-To-End Optimized Speech Coding with Deep Neural Networks , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Richard C. Hendriks,et al. On the information rate of speech communication , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Patrick Nguyen,et al. Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis , 2018, NeurIPS.
[7] Valero Laparra,et al. End-to-end Optimized Image Compression , 2016, ICLR.
[8] Jean-Marc Valin,et al. Speex: A Free Codec For Free Speech , 2016, ArXiv.
[9] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[10] Roch Lefebvre,et al. The adaptive multirate wideband speech codec (AMR-WB) , 2002, IEEE Trans. Speech Audio Process.
[11] Milos Cernak,et al. Composition of Deep and Spiking Neural Networks for Very Low Bit Rate Speech Coding , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[12] Oriol Vinyals,et al. Neural Discrete Representation Learning , 2017, NIPS.
[13] Lubomir D. Bourdev,et al. Real-Time Adaptive Image Compression , 2017, ICML.
[14] Lucas Theis,et al. Lossy Image Compression with Compressive Autoencoders , 2017, ICLR.
[15] Methods for Subjective Determination of Transmission Quality , 2022.
[16] Thomas P. Barnwell,et al. A 2.4 kbit/s MELP coder candidate for the new U.S. Federal Standard , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[17] Heiga Zen,et al. Sample Efficient Adaptive Text-to-Speech , 2018, ICLR.
[18] Quan Wang,et al. Wavenet Based Low Rate Speech Coding , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).