Creating speaker independent ASR system through prosody modification based data augmentation
S. Shahnawazuddin | Nagaraj Adiga | B. Tarun Sai | Hemant Kumar Kathania
[1] I. Hirsh,et al. Development of speech sounds in children , 1969, Acta oto-laryngologica. Supplementum.
[2] Daniel Elenius,et al. The PF_STAR children's speech corpus , 2005, INTERSPEECH.
[3] Paul Deléglise,et al. TED-LIUM: an Automatic Speech Recognition dedicated corpus , 2012, LREC.
[4] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[5] Sanjeev Khudanpur,et al. A study on data augmentation of reverberant speech for robust speech recognition , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Diego Giuliani,et al. Deep-neural network approaches for speech recognition with heterogeneous groups of speakers including children , 2016, Natural Language Engineering.
[7] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Steve Renals,et al. WSJCAM0: a British English speech corpus for large vocabulary continuous speech recognition , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[9] Syed Shahnawazuddin,et al. Assessment of pitch-adaptive front-end signal processing for children's speech recognition , 2018, Comput. Speech Lang.
[10] Hynek Hermansky,et al. Robust speech recognition in unknown reverberant and noisy conditions , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[11] Syed Shahnawazuddin,et al. Pitch-Normalized Acoustic Features for Robust Children's Speech Recognition , 2017, IEEE Signal Processing Letters.
[12] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[13] Syed Shahnawazuddin,et al. Effect of Prosody Modification on Children's ASR , 2017, IEEE Signal Processing Letters.
[14] Francoise Beaufays,et al. “Your Word is my Command”: Google Search by Voice: A Case Study , 2010 .
[15] Shrikanth S. Narayanan,et al. Acoustics of children's speech: developmental changes of temporal and spectral parameters , 1999, The Journal of the Acoustical Society of America.
[16] Martin J. Russell,et al. Challenges for computer recognition of children's speech , 2007, SLaTE.
[17] K. T. Deepak,et al. Speech and EGG polarity detection using Hilbert Envelope , 2015, TENCON 2015 - 2015 IEEE Region 10 Conference.
[18] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition , 2012 .
[19] Bayya Yegnanarayana,et al. Epoch Extraction From Speech Signals , 2008, IEEE Transactions on Audio, Speech, and Language Processing.
[20] Dong Yu,et al. Automatic Speech Recognition: A Deep Learning Approach , 2014 .
[21] Shrikanth S. Narayanan,et al. A review of ASR technologies for children's speech , 2009, WOCCI.
[22] B. Yegnanarayana,et al. Fast prosody modification using instants of significant excitation , 2010 .
[23] Raymond D. Kent,et al. Anatomical and neuromuscular maturation of the speech mechanism: evidence from acoustic studies , 1976, Journal of speech and hearing research.
[24] Tara N. Sainath,et al. Large vocabulary automatic speech recognition for children , 2015, INTERSPEECH.
[25] Sanjeev Khudanpur,et al. JHU ASpIRE system: Robust LVCSR with TDNNs, iVector adaptation and RNN-LMs , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[26] Bayya Yegnanarayana,et al. Determination of instants of significant excitation in speech using group delay function , 1995, IEEE Trans. Speech Audio Process.
[27] Navdeep Jaitly,et al. Vocal Tract Length Perturbation (VTLP) improves speech recognition , 2013 .
[28] Elmar Nöth,et al. Acoustic normalization of children's speech , 2003, INTERSPEECH.
[29] Diego Giuliani,et al. Vocal tract length normalisation approaches to DNN-based children's and adults' speech recognition , 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).
[30] E. A. Martin,et al. Multi-style training for robust isolated-word speech recognition , 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[31] Jan Cernocký,et al. Improved feature processing for deep neural networks , 2013, INTERSPEECH.