CGCNN: Complex Gabor Convolutional Neural Network on Raw Speech
暂无分享,去创建一个
Titouan Parcollet | Mohamed Morchid | Paul-Gauthier No'e | Titouan Parcollet | Mohamed Morchid | Paul-Gauthier Noé
[1] Prashant Parikh. A Theory of Communication , 2010 .
[2] Boris Ginsburg,et al. Jasper: An End-to-End Convolutional Neural Acoustic Model , 2019, INTERSPEECH.
[3] Gerald Sommer,et al. Geometric Computing with Clifford Algebras , 2001, Springer Berlin Heidelberg.
[4] Titouan Parcollet,et al. The Pytorch-kaldi Speech Recognition Toolkit , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Sandeep Subramanian,et al. Deep Complex Networks , 2017, ICLR.
[6] Hermann Ney,et al. Acoustic modeling with deep neural networks using raw time signal for LVCSR , 2014, INTERSPEECH.
[7] Stéphane Mallat,et al. A Wavelet Tour of Signal Processing - The Sparse Way, 3rd Edition , 2008 .
[8] S. Mallat. A wavelet tour of signal processing , 1998 .
[9] Gerald Penn,et al. Improving Speech Recognition with Drop-in Replacements for f-Bank Features , 2019, SLSP.
[10] Geoffrey E. Hinton,et al. Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.
[11] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[12] Jung-Woo Ha,et al. Phase-aware Speech Enhancement with Deep Complex U-Net , 2019, ICLR.
[13] Yoshua Bengio,et al. Improving Speech Recognition by Revising Gated Recurrent Units , 2017, INTERSPEECH.
[14] Yoshua Bengio,et al. Speech and Speaker Recognition from Raw Waveform with SincNet , 2018, ArXiv.
[15] Iasonas Kokkinos,et al. Learning Filterbanks from Raw Speech for Phone Recognition , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Chong Wang,et al. Deep Speech 2 : End-to-End Speech Recognition in English and Mandarin , 2015, ICML.
[17] Ron J. Weiss,et al. Speech acoustic modeling from raw multichannel waveforms , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Dimitri Palaz,et al. End-to-end Phoneme Sequence Recognition using Convolutional Neural Networks , 2013, ArXiv.
[19] Peter Bell,et al. Regularization of context-dependent deep neural networks with context-independent multi-task training , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Nicolas Usunier,et al. End-to-End Speech Recognition From the Raw Waveform , 2018, INTERSPEECH.
[21] Ying Zhang,et al. Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks , 2016, INTERSPEECH.
[22] Navdeep Jaitly,et al. Towards End-To-End Speech Recognition with Recurrent Neural Networks , 2014, ICML.
[23] Steve Renals,et al. On Learning Interpretable CNNs with Parametric Modulated Kernel-Based Filters , 2019, INTERSPEECH.
[24] Yoshua Bengio,et al. Speaker Recognition from Raw Waveform with SincNet , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[25] Shinji Watanabe,et al. ESPnet: End-to-End Speech Processing Toolkit , 2018, INTERSPEECH.
[26] Jonathan G. Fiscus,et al. DARPA TIMIT:: acoustic-phonetic continuous speech corpus CD-ROM, NIST speech disc 1-1.1 , 1993 .
[27] Shinji Watanabe,et al. Joint CTC-attention based end-to-end speech recognition using multi-task learning , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).