暂无分享,去创建一个
[1] Tiago H. Falk,et al. Investigating the effect of residual and highway connections in speech enhancement models , 2018 .
[2] Xiaofei Wang,et al. Stream Attention-based Multi-array End-to-end Speech Recognition , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] John-Paul Hosom. Automatic phoneme alignment based on acoustic-phonetic modeling , 2002, INTERSPEECH.
[4] Zhizheng Wu,et al. Investigating gated recurrent neural networks for speech synthesis , 2016, ArXiv.
[5] Quoc V. Le,et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Yajie Miao,et al. EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[7] Shuai Wang,et al. What Does the Speaker Embedding Encode? , 2017, INTERSPEECH.
[8] James R. Glass,et al. Learning Word-Like Units from Joint Audio-Visual Analysis , 2017, ACL.
[9] Sanjeev Khudanpur,et al. A time delay neural network architecture for efficient modeling of long temporal contexts , 2015, INTERSPEECH.
[10] Gabriel Synnaeve,et al. Wav2Letter++: A Fast Open-source Speech Recognition System , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Benjamin Lecouteux,et al. Analyzing Learned Representations of a Deep ASR Performance Prediction Model , 2018, BlackboxNLP@EMNLP.
[12] Ahmed Mohamed Abdel Maksoud Ali,et al. Multi-dialect Arabic broadcast speech recognition , 2018 .
[13] Yonatan Belinkov,et al. Analyzing Hidden Representations in End-to-End Automatic Speech Recognition Systems , 2017, NIPS.
[14] Grzegorz Chrupala,et al. Encoding of phonology in a recurrent neural model of grounded speech , 2017, CoNLL.
[15] James R. Glass,et al. A complete KALDI recipe for building Arabic speech recognition systems , 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).
[16] Sara Veldhoen,et al. Visualisation and 'Diagnostic Classifiers' Reveal How Recurrent and Recursive Neural Networks Process Hierarchical Structure , 2018, J. Artif. Intell. Res..
[17] Hao Tang,et al. Frame-Level Speaker Embeddings for Text-Independent Speaker Recognition and Analysis of End-to-End Model , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[18] Yonatan Belinkov,et al. Analysis Methods in Neural Language Processing: A Survey , 2018, TACL.
[19] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Xiangang Li,et al. Towards End-to-End Code-Switching Speech Recognition , 2018, ArXiv.
[21] Yu Wang,et al. PHONETIC AND GRAPHEMIC SYSTEMS FOR MULTI-GENRE BROADCAST TRANSCRIPTION , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] James R. Glass,et al. The MGB-2 challenge: Arabic multi-dialect broadcast media recognition , 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[23] Chong Wang,et al. Deep Speech 2 : End-to-End Speech Recognition in English and Mandarin , 2015, ICML.
[24] Paul Deléglise,et al. TED-LIUM: an Automatic Speech Recognition dedicated corpus , 2012, LREC.
[25] James Glass,et al. Analysis of Audio-Visual Features for Unsupervised Speech Recognition , 2017 .
[26] Willem H. Zuidema,et al. Visualisation and 'diagnostic classifiers' reveal how recurrent and recursive neural networks process hierarchical structure , 2017, J. Artif. Intell. Res..
[27] Sameer Khurana,et al. QCRI advanced transcription system (QATS) for the Arabic Multi-Dialect Broadcast media recognition: MGB-2 challenge , 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[28] Jürgen Schmidhuber,et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.
[29] Yonatan Belinkov,et al. Fine-grained Analysis of Sentence Embeddings Using Auxiliary Prediction Tasks , 2016, ICLR.
[30] Tasha Nagamine,et al. Exploring how deep neural networks form phonemic categories , 2015, INTERSPEECH.
[31] Grzegorz Chrupala,et al. Representations of language in a model of visually grounded speech signal , 2017, ACL.
[32] Hung-yi Lee,et al. Gate Activation Signal Analysis for Gated Recurrent Neural Networks and its Correlation with Phoneme Boundaries , 2017, INTERSPEECH.
[33] Sebastian Stober,et al. Neuron Activation Profiles for Interpreting Convolutional Speech Recognition Models , 2018 .
[34] Tasha Nagamine,et al. On the Role of Nonlinear Transformations in Deep Neural Network Acoustic Models , 2016, INTERSPEECH.
[35] Mikko Kurimo,et al. Aalto system for the 2017 Arabic multi-genre broadcast challenge , 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).