CIF: Continuous Integrate-And-Fire for End-To-End Speech Recognition
暂无分享,去创建一个
Linhao Dong | Bo Xu | Bo Xu | Linhao Dong
[1] Wonyong Sung,et al. Fully Neural Network Based Speech Recognition on Mobile and Embedded Devices , 2018, NeurIPS.
[2] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[3] Jürgen Schmidhuber,et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.
[4] Wolfgang Maass,et al. Networks of Spiking Neurons: The Third Generation of Neural Network Models , 1996, Electron. Colloquium Comput. Complex..
[5] Samy Bengio,et al. Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks , 2015, NIPS.
[6] Navdeep Jaitly,et al. Towards Better Decoding and Language Model Integration in Sequence to Sequence Models , 2016, INTERSPEECH.
[7] Alex Graves,et al. Sequence Transduction with Recurrent Neural Networks , 2012, ArXiv.
[8] Matt Shannon,et al. Recurrent Neural Aligner: An Encoder-Decoder Neural Network Model for Sequence to Sequence Mapping , 2017, INTERSPEECH.
[9] Sanjeev Khudanpur,et al. Audio augmentation for speech recognition , 2015, INTERSPEECH.
[10] Alex Graves,et al. Video Pixel Networks , 2016, ICML.
[11] Samy Bengio,et al. An Online Sequence-to-Sequence Model Using Partial Conditioning , 2015, NIPS.
[12] Wofgang Maas,et al. Networks of spiking neurons: the third generation of neural network models , 1997 .
[13] Colin Raffel,et al. Monotonic Chunkwise Attention , 2017, ICLR.
[14] Gabriel Synnaeve,et al. Letter-Based Speech Recognition with Gated ConvNets , 2017, ArXiv.
[15] Boris Ginsburg,et al. Jasper: An End-to-End Convolutional Neural Acoustic Model , 2019, INTERSPEECH.
[16] Shiliang Zhang,et al. Gaussian Prediction Based Attention for Online End-to-End Speech Recognition , 2017, INTERSPEECH.
[17] Ronan Collobert,et al. Wav2Letter++: A Fast Open-source Speech Recognition System , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Peter W. Jusczyk,et al. How infants begin to extract words from speech , 1999, Trends in Cognitive Sciences.
[19] Rico Sennrich,et al. Neural Machine Translation of Rare Words with Subword Units , 2015, ACL.
[20] Shinji Watanabe,et al. ESPnet: End-to-End Speech Processing Toolkit , 2018, INTERSPEECH.
[21] Luke S. Zettlemoyer,et al. Transformers with convolutional context for ASR , 2019, ArXiv.
[22] Pascale Fung,et al. HKUST/MTS: A Very Large Scale Mandarin Telephone Speech Corpus , 2006, ISCSLP.
[23] Satoshi Nakamura,et al. Local Monotonic Attention Mechanism for End-to-End Speech And Language Processing , 2017, IJCNLP.
[24] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[25] Jonathan Le Roux,et al. Triggered Attention for End-to-end Speech Recognition , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Mohan Li,et al. End-to-end Speech Recognition with Adaptive Computation Steps , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Anthony N. Burkitt,et al. A Review of the Integrate-and-fire Neuron Model: I. Homogeneous Synaptic Input , 2006, Biological Cybernetics.
[28] Quoc V. Le,et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[29] Yoshua Bengio,et al. Attention-Based Models for Speech Recognition , 2015, NIPS.
[30] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[31] Hui Bu,et al. AISHELL-2: Transforming Mandarin ASR Research Into Industrial Scale , 2018, ArXiv.
[32] Gang Liu,et al. An Online Attention-based Model for Speech Recognition , 2018, INTERSPEECH.
[33] Shinji Watanabe,et al. Joint CTC-attention based end-to-end speech recognition using multi-task learning , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[34] Martín Abadi,et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.
[35] Wei Chen,et al. Extending Recurrent Neural Aligner for Streaming End-to-End Speech Recognition in Mandarin , 2018, INTERSPEECH.
[36] Richard Socher,et al. Improving End-to-End Speech Recognition with Policy Learning , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[37] L. F Abbott,et al. Lapicque’s introduction of the integrate-and-fire model neuron (1907) , 1999, Brain Research Bulletin.
[38] Bo Xu,et al. Self-attention Aligner: A Latency-control End-to-end Model for ASR Using Self-attention Network and Chunk-hopping , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[39] Hermann Ney,et al. An Analysis of Local Monotonic Attention Variants , 2019, INTERSPEECH.
[40] Gabriel Synnaeve,et al. Wav2Letter++: A Fast Open-source Speech Recognition System , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[41] Shuang Xu,et al. A Comparison of Modeling Units in Sequence-to-Sequence Speech Recognition with the Transformer on Mandarin Chinese , 2018, ICONIP.
[42] Zhiheng Huang,et al. Self-attention Networks for Connectionist Temporal Classification in Speech Recognition , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[43] Hermann Ney,et al. RWTH ASR Systems for LibriSpeech: Hybrid vs Attention - w/o Data Augmentation , 2019, INTERSPEECH.
[44] Tara N. Sainath,et al. State-of-the-Art Speech Recognition with Sequence-to-Sequence Models , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[45] Hermann Ney,et al. Improved training of end-to-end attention models for speech recognition , 2018, INTERSPEECH.
[46] Quoc V. Le,et al. SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition , 2019, INTERSPEECH.
[47] Geoffrey E. Hinton,et al. Layer Normalization , 2016, ArXiv.
[48] Yiming Wang,et al. Purely Sequence-Trained Neural Networks for ASR Based on Lattice-Free MMI , 2016, INTERSPEECH.