Native Language Identification from Raw Waveforms Using Deep Convolutional Neural Networks with Attentive Pooling
Keelan Evanini | Chong Min Lee | Chee Wee Leong | Rutuja Ubale | Vikram Ramanarayanan | Yao Qian
[1] Frank K. Soong, et al. From Speech Signals to Semantics — Tagging Performance at Acoustic, Phonetic and Word Levels, 2018, 2018 11th International Symposium on Chinese Spoken Language Processing (ISCSLP).
[2] Tara N. Sainath, et al. Learning filter banks within a deep neural network framework, 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[3] Georg Heigold, et al. End-to-end text-dependent speaker verification, 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Yu Zhang, et al. Very deep convolutional networks for end-to-end speech recognition, 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Y. Nesterov. A method for unconstrained convex minimization problem with the rate of convergence O(1/k^2), 1983.
[6] Björn W. Schuller, et al. Convolutional Neural Networks with Data Augmentation for Classifying Speakers' Native Language, 2016, INTERSPEECH.
[7] Visar Berisha, et al. Accent Identification by Combining Deep Neural Networks and Recurrent Neural Networks Trained on Long and Short Term Features, 2016, INTERSPEECH.
[8] Yuanyuan Zhang, et al. Attention Based Fully Convolutional Network for Speech Emotion Recognition, 2018, 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC).
[9] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Tara N. Sainath, et al. Learning the speech front-end with raw waveform CLDNNs, 2015, INTERSPEECH.
[11] Sam Keene, et al. A Fully Convolutional Neural Network Approach to End-to-End Speech Enhancement, 2018, ArXiv.
[12] Razvan Pascanu, et al. Advances in optimizing recurrent networks, 2012, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[13] Eduardo Coutinho, et al. The INTERSPEECH 2016 Computational Paralinguistics Challenge: Deception, Sincerity & Native Language, 2016, INTERSPEECH.
[14] Sanjeev Khudanpur, et al. X-Vectors: Robust DNN Embeddings for Speaker Recognition, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Andrew Zisserman, et al. Emotion Recognition in Speech using Cross-Modal Transfer in the Wild, 2018, ACM Multimedia.
[16] Yifan Gong, et al. End-to-End attention based text-dependent speaker verification, 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[17] Hermann Ney, et al. Acoustic modeling with deep neural networks using raw time signal for LVCSR, 2014, INTERSPEECH.
[18] Panayiotis G. Georgiou, et al. Multimodal Fusion of Multirate Acoustic, Prosodic, and Lexical Speaker Characteristics for Native Language Identification, 2016, INTERSPEECH.
[19] Kandarpa Kumar Sarma, et al. Emotion Identification from Raw Speech Signals Using DNNs, 2018, INTERSPEECH.
[20] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.
[21] Koichi Shinoda, et al. Attentive Statistics Pooling for Deep Speaker Embedding, 2018, INTERSPEECH.
[22] Yongqiang Wang, et al. Towards End-to-end Spoken Language Understanding, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] Quoc V. Le, et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition, 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Sanjeev Khudanpur, et al. Spoken Language Recognition using X-vectors, 2018, Odyssey.
[25] Tara N. Sainath, et al. Convolutional, Long Short-Term Memory, fully connected Deep Neural Networks, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Klaus Zechner, et al. Adapting the acoustic model of a speech recognizer for varied proficiency non-native spontaneous speech using read speech with language-specific pronunciation difficulty, 2009, INTERSPEECH.
[27] Tara N. Sainath, et al. Multi-Dialect Speech Recognition with a Single Sequence-to-Sequence Model, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] David Suendermann-Oeft, et al. Exploring ASR-free end-to-end modeling to improve spoken language understanding in a cloud-based dialog system, 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[29] Yu Tsao, et al. Temporal Attentive Pooling for Acoustic Event Detection, 2018, INTERSPEECH.
[30] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[31] Keelan Evanini, et al. Exploring End-To-End Attention-Based Neural Networks For Native Language Identification, 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[32] Iasonas Kokkinos, et al. Learning Filterbanks from Raw Speech for Phone Recognition, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[33] Chao Huang, et al. Accent modeling based on pronunciation dictionary adaptation for large vocabulary Mandarin speech recognition, 2000, INTERSPEECH.
[34] Yoshua Bengio, et al. Speaker Recognition from Raw Waveform with SincNet, 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[35] Róbert Busa-Fekete, et al. Determining Native Language and Deception Using Phonetic Features and Classifier Combination, 2016, INTERSPEECH.
[36] Wei Dai, et al. Very deep convolutional neural networks for raw waveforms, 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[37] Ramón Fernández Astudillo, et al. Exploiting Phone Log-Likelihood Ratio Features for the Detection of the Native Language of Non-Native English Speakers, 2016, INTERSPEECH.
[38] Avni Rajpal, et al. Native Language Identification Using Spectral and Source-Based Features, 2016, INTERSPEECH.
[39] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.