Incorporating Noise Robustness in Speech Command Recognition by Noise Augmentation of Training Data
Yousaf Bin Zikria | Naveed Khan Baloch | Farruh Ishmanov | Fawad Riasat Raja | Fawad Hussain | Huma Israr | Ayesha Pervaiz | Muhammad Ali Tahir
[1] Hermann Ney, et al. Mean-normalized stochastic gradient for large-scale deep learning, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Shruti Sannon, et al. "Alexa is my new BFF": Social Roles, User Satisfaction, and Personification of the Amazon Echo, 2017, CHI Extended Abstracts.
[3] Kaisheng Yao, et al. Deep neural support vector machines for speech recognition, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] I. Elamvazuthi, et al. Voice Recognition Algorithms using Mel Frequency Cepstral Coefficient (MFCC) and Dynamic Time Warping (DTW) Techniques, 2010, ArXiv.
[5] Tara N. Sainath, et al. Deep Convolutional Neural Networks for Large-scale Speech Tasks, 2015, Neural Networks.
[6] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Sanjeev Khudanpur, et al. A time delay neural network architecture for efficient modeling of long temporal contexts, 2015, INTERSPEECH.
[8] Philip C. Woodland, et al. Very deep convolutional neural networks for robust speech recognition, 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[9] Jimmy J. Lin, et al. Deep Residual Learning for Small-Footprint Keyword Spotting, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Sercan Ömer Arik, et al. Convolutional Recurrent Neural Networks for Small-Footprint Keyword Spotting, 2017, INTERSPEECH.
[11] Dong Yu, et al. An introduction to voice search, 2008, IEEE Signal Processing Magazine.
[12] Ali Farhadi, et al. You Only Look Once: Unified, Real-Time Object Detection, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Justin Salamon, et al. Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification, 2016, IEEE Signal Processing Letters.
[14] Chunlei Zhang, et al. End-to-End Text-Independent Speaker Verification with Triplet Loss on Short Utterances, 2017, INTERSPEECH.
[15] Dong Yu, et al. Automatic Speech Recognition: A Deep Learning Approach, 2014.
[16] Geoffrey E. Hinton, et al. Dynamic Routing Between Capsules, 2017, NIPS.
[17] H. Ney, et al. Linear discriminant analysis for improved large vocabulary continuous speech recognition, 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[18] Geoffrey Zweig, et al. Personalizing Model M for Voice-Search, 2011, INTERSPEECH.
[19] Vaibhava Goel, et al. Advances in Very Deep Convolutional Neural Networks for LVCSR, 2016, INTERSPEECH.
[20] Andreas G. Andreou, et al. Investigation of silicon auditory models and generalization of linear discriminant analysis for improved speech recognition, 1997.
[21] Georg Heigold, et al. Small-footprint keyword spotting using deep neural networks, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Pete Warden, et al. Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition, 2018, ArXiv.
[23] Daniel Povey, et al. The Kaldi Speech Recognition Toolkit, 2011.
[24] M. Picheny, et al. Comparison of Parametric Representation for Monosyllabic Word Recognition in Continuously Spoken Sentences, 2017.
[25] François Chollet, et al. Xception: Deep Learning with Depthwise Separable Convolutions, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Will Song, et al. End-to-End Deep Neural Network for Automatic Speech Recognition, 2015.
[27] Taghi M. Khoshgoftaar, et al. A survey on Image Data Augmentation for Deep Learning, 2019, Journal of Big Data.
[28] Dae-Shik Kim, et al. End-to-End Speech Command Recognition with Capsule Network, 2018, INTERSPEECH.
[29] Tara N. Sainath, et al. Convolutional neural networks for small-footprint keyword spotting, 2015, INTERSPEECH.
[30] Nikko Strom, et al. Max-pooling loss training of long short-term memory networks for small-footprint keyword spotting, 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[31] Luis A. Guerrero, et al. Alexa vs. Siri vs. Cortana vs. Google Assistant: A Comparison of Speech-Based Natural User Interfaces, 2017.
[32] Geoffrey Zweig, et al. Deep Convolutional Neural Networks with Layer-Wise Context Expansion and Attention, 2016, INTERSPEECH.
[33] Yifan Gong, et al. Recurrent support vector machines for speech recognition, 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[34] Andreas Stolcke, et al. The Microsoft 2017 Conversational Speech Recognition System, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[35] Douglas Coimbra de Andrade, et al. A neural attention model for speech command recognition, 2018, ArXiv.
[36] Bowen Zhou, et al. IBM MASTOR SYSTEM: Multilingual Automatic Speech-to-Speech Translator, 2006.
[37] Srinivasan Umesh, et al. Improved cepstral mean and variance normalization using Bayesian framework, 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[38] Yanmin Qian, et al. Very Deep Convolutional Neural Networks for Noise Robust Speech Recognition, 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[39] Brian McMahan, et al. Listening to the World Improves Speech Command Recognition, 2017, AAAI.
[40] Yifan Gong, et al. An analysis of convolutional neural networks for speech recognition, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[41] Kilian Q. Weinberger, et al. Densely Connected Convolutional Networks, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[42] Bin Ma, et al. Joint Application of Speech and Speaker Recognition for Automation and Security in Smart Home, 2011, INTERSPEECH.
[43] Ramesh A. Gopinath, et al. Maximum likelihood modeling with Gaussian distributions for classification, 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[44] Gerald Penn, et al. Convolutional Neural Networks for Speech Recognition, 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[45] Joseph Keshet, et al. SpeechYOLO: Detection and Localization of Speech Objects, 2019, INTERSPEECH.
[46] Patrick Jansson, et al. Single-word speech recognition with Convolutional Neural Networks on raw waveforms, 2018.
[47] Sunil Kumar Kopparapu, et al. Label-Driven Time-Frequency Masking for Robust Speech Command Recognition, 2019, TSD.
[48] Olli Viikki, et al. Cepstral domain segmental feature vector normalization for noise robust speech recognition, 1998, Speech Commun.