Learning Filterbanks from Raw Speech for Phone Recognition
暂无分享,去创建一个
Iasonas Kokkinos | Nicolas Usunier | Emmanuel Dupoux | Neil Zeghidour | Gabriel Synnaeve | Thomas Schatz | Nicolas Usunier | Gabriel Synnaeve | Emmanuel Dupoux | Neil Zeghidour | Thomas Schatz | Iasonas Kokkinos
[1] J. L. Flanagan,et al. Parametric coding of speech spectra , 1980 .
[2] Tara N. Sainath,et al. Learning the speech front-end with raw waveform CLDNNs , 2015, INTERSPEECH.
[3] Geoffrey Zweig,et al. Achieving Human Parity in Conversational Speech Recognition , 2016, ArXiv.
[4] Michael S. Lewicki,et al. Efficient auditory coding , 2006, Nature.
[5] Dimitri Palaz,et al. Estimating phoneme class conditional probabilities from raw speech signal using convolutional neural networks , 2013, INTERSPEECH.
[6] Ron J. Weiss,et al. Speech acoustic modeling from raw multichannel waveforms , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[7] Dimitri Palaz,et al. End-to-end Phoneme Sequence Recognition using Convolutional Neural Networks , 2013, ArXiv.
[8] Maarten Versteegh,et al. A deep scattering spectrum — Deep Siamese network pipeline for unsupervised acoustic modeling , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Joakim Andén,et al. Deep Scattering Spectrum , 2013, IEEE Transactions on Signal Processing.
[10] Gabriel Synnaeve,et al. Wav2Letter: an End-to-End ConvNet-based Speech Recognition System , 2016, ArXiv.
[11] Ying Zhang,et al. Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks , 2016, INTERSPEECH.
[12] Tara N. Sainath,et al. Deep Scattering Spectrum with deep neural networks , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..
[14] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[15] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[16] Jian Sun,et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[17] Satoshi Nakamura,et al. Attention-based Wav2Text with feature transfer learning , 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[18] Qiang Chen,et al. Network In Network , 2013, ICLR.
[19] Carla Teixeira Lopes,et al. TIMIT Acoustic-Phonetic Continuous Speech Corpus , 2012 .
[20] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[21] Liang Lu,et al. Segmental Recurrent Neural Networks for End-to-End Speech Recognition , 2016, INTERSPEECH.
[22] M. Picheny,et al. Comparison of Parametric Representation for Monosyllabic Word Recognition in Continuously Spoken Sentences , 2017 .
[23] Yoshua Bengio,et al. Attention-Based Models for Speech Recognition , 2015, NIPS.
[24] László Tóth. Phone recognition with hierarchical convolutional deep maxout networks , 2015, EURASIP J. Audio Speech Music. Process..