An Experimental Study on the Significance of Variable Frame-Length and Overlap in the Context of Children’s Speech Recognition
S. Shahnawazuddin | Gayadhar Pradhan | Hemant Kumar Kathania | Waquar Ahmad | Chaman Singh
[1] Ronald A. Cole,et al. Highly accurate children's speech recognition for interactive reading tutors using subword units , 2007, Speech Commun..
[2] Syed Shahnawazuddin,et al. Pitch-Normalized Acoustic Features for Robust Children's Speech Recognition , 2017, IEEE Signal Processing Letters.
[3] Harald Singer,et al. Pitch dependent phone modelling for HMM based speech recognition , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[4] Syed Shahnawazuddin,et al. Sparse coding over redundant dictionaries for fast adaptation of speech recognition system , 2017, Comput. Speech Lang..
[5] Francoise Beaufays,et al. “Your Word is my Command”: Google Search by Voice: A Case Study , 2010 .
[6] Shweta Ghai,et al. Addressing Pitch Mismatch for Children's Automatic Speech Recognition , 2011 .
[7] Lonce L. Wyse,et al. Real-Time Signal Estimation From Modified Short-Time Fourier Transform Magnitude Spectra , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[8] Raymond D. Kent,et al. Speech segment durations in sentence recitations by children and adults , 1980 .
[9] Shrikanth S. Narayanan,et al. Analysis of children's speech: duration, pitch and formants , 1997, EUROSPEECH.
[10] Elmar Nöth,et al. Acoustic normalization of children's speech , 2003, INTERSPEECH.
[11] Richard M. Stern,et al. On the effects of speech rate in large vocabulary speech recognition systems , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[12] Kai Feng,et al. The subspace Gaussian mixture model - A structured model for speech recognition , 2011, Comput. Speech Lang..
[13] Eric Fosler-Lussier,et al. Towards robustness to fast speech in ASR , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[14] Daniel Povey,et al. Speaking rate adaptation using continuous frame rate normalization , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[15] J. L. Miller,et al. Effect of speaking rate on the perceptual structure of a phonetic category , 1989, Perception & psychophysics.
[16] Tara N. Sainath,et al. Large vocabulary automatic speech recognition for children , 2015, INTERSPEECH.
[17] Xu Shao,et al. Pitch prediction from MFCC vectors for speech reconstruction , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[18] Shrikanth S. Narayanan,et al. A review of ASR technologies for children's speech , 2009, WOCCI.
[19] Eric Fosler-Lussier,et al. Speech recognition using on-line estimation of speaking rate , 1997, EUROSPEECH.
[20] Mark A. Fanty,et al. Rapid unsupervised adaptation to children's speech on a connected-digit task , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[21] Lonce Wyse,et al. An Efficient Algorithm for Real-Time Spectrogram Inversion , 2005 .
[22] Xiaohui Zhang,et al. Improving deep neural network acoustic models using generalized maxout networks , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] Dong Yu,et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[24] Martin J. Russell,et al. Challenges for computer recognition of children's speech , 2007, SLaTE.
[25] Daniel L. Valente,et al. Experimental investigation of the effects of the acoustical conditions in a simulated classroom on speech recognition and learning in children. , 2012, The Journal of the Acoustical Society of America.
[26] Diego Giuliani,et al. Deep-neural network approaches for speech recognition with heterogeneous groups of speakers including children , 2016, Natural Language Engineering.
[27] Jay G. Wilpon,et al. A study of speech recognition for children and the elderly , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[28] Sree Hari Krishnan Parthasarathi,et al. fMLLR based feature-space speaker adaptation of DNN acoustic models , 2015, INTERSPEECH.
[29] Abeer Alwan,et al. Entropy-based variable frame rate analysis of speech signals and its application to ASR , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[30] Q. Summerfield. Articulatory rate and perceptual constancy in phonetic perception. , 1981, Journal of experimental psychology. Human perception and performance.
[31] Shrikanth S. Narayanan,et al. Robust recognition of children's speech , 2003, IEEE Trans. Speech Audio Process..
[32] Syed Shahnawazuddin,et al. Pitch-Adaptive Front-End Features for Robust Children's ASR , 2016, INTERSPEECH.
[33] Steve Renals,et al. WSJCAM0: a British English speech corpus for large vocabulary continuous speech recognition , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[34] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[35] Li Lee,et al. A frequency warping approach to speaker normalization , 1998, IEEE Trans. Speech Audio Process..
[36] Biing-Hwang Juang,et al. Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.
[37] Bryan L. Pellom,et al. Children's speech recognition with application to interactive books and tutors , 2003, 2003 IEEE Workshop on Automatic Speech Recognition and Understanding (IEEE Cat. No.03EX721).
[38] Luís C. Oliveira,et al. Pitch-synchronous time-scaling for prosodic and voice quality transformations , 2005, INTERSPEECH.
[39] Syed Shahnawazuddin,et al. Enhancing noise and pitch robustness of children's ASR , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[40] Shrikanth S. Narayanan,et al. Acoustics of children's speech: developmental changes of temporal and spectral parameters. , 1999, The Journal of the Acoustical Society of America.
[41] Zheng-Hua Tan,et al. Low-Complexity Variable Frame Rate Analysis for Speech Recognition and Voice Activity Detection , 2010, IEEE Journal of Selected Topics in Signal Processing.
[42] Stan Davis,et al. Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences , 1980 .
[43] Sandra P. Whiteside,et al. Speech patterns of children and adults elicited via a picture-naming task: An acoustic study , 2000, Speech Commun..
[44] Daniel Elenius,et al. The PF_STAR children's speech corpus , 2005, INTERSPEECH.