Efficient and Scalable Neural Residual Waveform Coding with Collaborative Quantization
Minje Kim | Jongmo Sung | Mi Suk Lee | Kai Zhen | Seungkwon Beack
[1] Srihari Kankanahalli,et al. End-To-End Optimized Speech Coding with Deep Neural Networks , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Abeer Alwan,et al. Speech Coding: Fundamentals and Applications , 2003 .
[3] Quan Wang,et al. Wavenet Based Low Rate Speech Coding , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Fumitada Itakura. Early developments of LPC speech coding techniques , 1990, ICSLP.
[5] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[6] Luca Benini,et al. Soft-to-Hard Vector Quantization for End-to-End Learning Compressible Representations , 2017, NIPS.
[7] Michael Keyhl,et al. Perceptual Objective Listening Quality Assessment (POLQA), The Third Generation ITU-T Standard for End-to-End Speech Quality Measurement Part I-Temporal Alignment , 2013 .
[8] James L. Flanagan,et al. Adaptive quantization in differential PCM coding of speech , 1973 .
[9] W. Bastiaan Kleijn,et al. Rate Distribution Between Model and Signal , 2007, 2007 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
[10] Biing-Hwang Juang,et al. Line spectrum pair (LSP) and speech data compression , 1984, ICASSP.
[11] Minje Kim,et al. Cascaded Cross-Module Residual Learning towards Lightweight End-to-End Speech Coding , 2019, INTERSPEECH.
[12] Hirokazu Kameoka,et al. Progress in LPC-based frequency-domain audio coding , 2016 .
[13] Thomas C. Walters,et al. Low Bit-rate Speech Coding with VQ-VAE and a WaveNet Decoder , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[15] J.D. Gibson,et al. Speech coding methods, standards, and applications , 2005, IEEE Circuits and Systems Magazine.
[16] Abhinav Kumar,et al. Study and Performance of AMR Codecs for GSM , 2014 .
[17] Erich Elsen,et al. Efficient Neural Audio Synthesis , 2018, ICML.
[18] ITU-T Rec. G.722.2 (07/2003) Wideband coding of speech at around 16 kbit/s using Adaptive Multi-Rate Wideband (AMR-WB) , 2004 .
[19] Timothy B. Terriberry,et al. High-Quality, Low-Delay Music Coding in the Opus Codec , 2016, ArXiv.
[20] Jean-Marc Valin,et al. Speex: A Free Codec For Free Speech , 2016, ArXiv.
[21] D. O'Shaughnessy,et al. Linear predictive coding , 1988, IEEE Potentials.
[22] Andries P. Hekstra,et al. Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[23] Gerhard Stoll,et al. ISO-MPEG-1 Audio: A Generic Standard for Coding of High-Quality Digital Audio , 1994 .
[24] Oriol Vinyals,et al. Neural Discrete Representation Learning , 2017, NIPS.
[25] Jan Skoglund,et al. LPCNet: Improving Neural Speech Synthesis through Linear Prediction , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Jan Skoglund,et al. A Real-Time Wideband Neural Vocoder at 1.6 kb/s Using LPCNet , 2019, INTERSPEECH.
[27] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Roch Lefebvre,et al. The adaptive multirate wideband speech codec (AMR-WB) , 2002, IEEE Trans. Speech Audio Process..
[29] Andreas Spanias,et al. Speech coding: a tutorial review , 1994, Proc. IEEE.