Data Augmentation for Deep Neural Network Acoustic Modeling
暂无分享,去创建一个
Xiaodong Cui | Brian Kingsbury | Vaibhava Goel | Brian Kingsbury | Vaibhava Goel | Xiaodong Cui | Xiaodong Cui | V. Goel | Xiaodong Cui | Brian Kingsbury
[1] Li Lee,et al. A frequency warping approach to speaker normalization , 1998, IEEE Trans. Speech Audio Process..
[2] Hermann Ney,et al. Data augmentation, feature combination, and multilingual neural networks to improve ASR and KWS performance for low-resource languages , 2014, INTERSPEECH.
[3] Philip C. Woodland,et al. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models , 1995, Comput. Speech Lang..
[4] Xiaodong Cui,et al. Developing speech recognition systems for corpus indexing under the IARPA Babel program , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[5] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.
[6] Li Deng,et al. Large-vocabulary speech recognition under adverse acoustic environments , 2000, INTERSPEECH.
[7] M. Hunt,et al. Distance measures for speech recognition , 1989 .
[8] Xiaodong Cui,et al. A high-performance Cantonese keyword search system , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[9] Xiaodong Cui,et al. System combination and score normalization for spoken term detection , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[10] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[11] Tomoki Toda,et al. Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[12] Simon Haykin,et al. GradientBased Learning Applied to Document Recognition , 2001 .
[13] John R. Hershey,et al. Hidden Markov Acoustic Modeling With Bootstrap and Restructuring for Low-Resourced Languages , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[14] Patrice Y. Simard,et al. Best practices for convolutional neural networks applied to visual document analysis , 2003, Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings..
[15] Richard Sproat,et al. Lattice-Based Search for Spoken Utterance Retrieval , 2004, NAACL.
[16] Javed A. Aslam,et al. Relevance score normalization for metasearch , 2001, CIKM '01.
[17] Tara N. Sainath,et al. Improvements to Deep Convolutional Neural Networks for LVCSR , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[18] Xiaodong Cui,et al. Data augmentation for deep convolutional neural network acoustic modeling , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Eric Moulines,et al. Continuous probabilistic transform for voice conversion , 1998, IEEE Trans. Speech Audio Process..
[20] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[21] Mark J. F. Gales,et al. Speech recognition and keyword spotting for low-resource languages: Babel project research at CUED , 2014, SLTU.
[22] Brian Kingsbury,et al. The IBM Attila speech recognition toolkit , 2010, 2010 IEEE Spoken Language Technology Workshop.
[23] Mark J. F. Gales,et al. Maximum likelihood linear transformations for HMM-based speech recognition , 1998, Comput. Speech Lang..
[24] Tara N. Sainath,et al. Joint training of convolutional and non-convolutional neural networks , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Gerald Penn,et al. Convolutional Neural Networks for Speech Recognition , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[26] Xiaodong Cui,et al. Data Augmentation for deep neural network acoustic modeling , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Michael Picheny,et al. Noise robustness in speech to speech translation , 2003, INTERSPEECH.
[28] Jonathan G. Fiscus,et al. Results of the 2006 Spoken Term Detection Evaluation , 2006 .
[29] Mark J. F. Gales,et al. Semi-tied covariance matrices for hidden Markov models , 1999, IEEE Trans. Speech Audio Process..
[30] Tara N. Sainath,et al. Scalable Minimum Bayes Risk Training of Deep Neural Network Acoustic Models Using Distributed Hessian-free Optimization , 2012, INTERSPEECH.
[31] Navdeep Jaitly,et al. Vocal Tract Length Perturbation (VTLP) improves speech recognition , 2013 .
[32] Dong Yu,et al. Feature engineering in Context-Dependent Deep Neural Networks for conversational speech transcription , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[33] Mark J. F. Gales,et al. Investigation of unsupervised adaptation of DNN acoustic models with filter bank input , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[34] E. A. Martin,et al. Multi-style training for robust isolated-word speech recognition , 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[35] Dong Yu,et al. Conversational Speech Transcription Using Context-Dependent Deep Neural Networks , 2012, ICML.
[36] Dong Yu,et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[37] Naoyuki Kanda,et al. Elastic spectral distortion for low resource speech recognition with deep neural networks , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[38] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition , 2012 .
[39] Mark J. F. Gales,et al. Data augmentation for low resource languages , 2014, INTERSPEECH.
[40] Satoshi Nakamura,et al. Voice conversion through vector quantization , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.
[41] Xiaodong Cui,et al. Improving deep neural network acoustic modeling for audio corpus indexing under the IARPA babel program , 2014, INTERSPEECH.
[42] Jan Cernocký,et al. But neural network features for spontaneous Vietnamese in BABEL , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).