TimeScaleNet: A Multiresolution Approach for Raw Audio Recognition
Eric Bavu | Aro Ramamonjy | Hadrien Pujol | Alexandre Garcia
[1] Lukasz Kaiser, et al. Depthwise Separable Convolutions for Neural Machine Translation, 2017, ICLR.
[2] Juhan Nam, et al. SampleCNN: End-to-End Deep Convolutional Neural Networks Using Very Small Filters for Music Classification, 2018.
[3] Wei Dai, et al. Very deep convolutional neural networks for raw waveforms, 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Julius O. Smith, et al. Introduction to Digital Filters: with Audio Applications, 2007.
[5] Pete Warden, et al. Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition, 2018, ArXiv.
[6] Tara N. Sainath, et al. Learning the speech front-end with raw waveform CLDNNs, 2015, INTERSPEECH.
[7] Ron J. Weiss, et al. Speech acoustic modeling from raw multichannel waveforms, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Yundong Zhang, et al. Hello Edge: Keyword Spotting on Microcontrollers, 2017, ArXiv.
[9] Jian Sun, et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[10] Sepp Hochreiter, et al. Self-Normalizing Neural Networks, 2017, NIPS.
[11] Malcolm Slaney, et al. An Efficient Implementation of the Patterson-Holdsworth Auditory Filter Bank, 1997.
[12] Tara N. Sainath, et al. Multichannel Signal Processing With Deep Neural Networks for Automatic Speech Recognition, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[13] Geoffrey E. Hinton, et al. Layer Normalization, 2016, ArXiv.
[14] Heiga Zen, et al. WaveNet: A Generative Model for Raw Audio, 2016, SSW.
[15] Benjamin Schrauwen, et al. End-to-end learning for music audio, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Richard F. Lyon, et al. Cascades of two-pole-two-zero asymmetric resonators are good models of peripheral auditory function, 2011, The Journal of the Acoustical Society of America.
[17] Xavier Serra, et al. A Wavenet for Speech Denoising, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Jimmy J. Lin, et al. Deep Residual Learning for Small-Footprint Keyword Spotting, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[20] Hermann Ney, et al. Acoustic modeling with deep neural networks using raw time signal for LVCSR, 2014, INTERSPEECH.
[21] Roy D. Patterson. Auditory models as preprocessors for speech recognition, 1992.