Generative Modeling of Pseudo-Whisper for Robust Whispered Speech Recognition
John H. L. Hansen | Hynek Boril | Shabnam Ghaffarzadegan
[1] Philip C. Woodland,et al. Experiments in speaker normalisation and adaptation for large vocabulary speech recognition , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[2] Boon Pang Lim,et al. Computational differences between whispered and non-whispered speech , 2011 .
[3] John H. L. Hansen,et al. Unsupervised equalization of Lombard effect for speech recognition in noisy adverse environment , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[4] John H. L. Hansen,et al. UT-Vocal Effort II: Analysis and constrained-lexicon recognition of whispered speech , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Richard M. Stern,et al. A vector Taylor series approach for environment-independent speech recognition , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[6] Herbert Gish,et al. A parametric approach to vocal tract length normalization , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[7] Victor Zue,et al. Speech database development at MIT: Timit and beyond , 1990, Speech Commun..
[8] John H. L. Hansen,et al. Acoustic analysis for speaker identification of whispered speech , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[9] D. B. Paul. A speaker-stress resistant HMM isolated word recognizer , 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[10] John H. L. Hansen,et al. Acoustic analysis and feature transformation from neutral to whisper for speaker identification within whispered speech audio streams , 2013, Speech Commun..
[11] Thomas Hofmann,et al. Greedy Layer-Wise Training of Deep Networks , 2007 .
[12] Dorde T. Grozdic,et al. Application of inverse filtering in enhancement of whisper recognition , 2014, 12th Symposium on Neural Network Applications in Electrical Engineering (NEUREL).
[13] Mark A. Clements,et al. Reconstruction of speech from whispers , 2002, MAVEBA.
[14] John H. L. Hansen,et al. Model and feature based compensation for whispered speech recognition , 2014, INTERSPEECH.
[15] Nathalie Henrich,et al. Speaking in noise: How does the Lombard effect improve acoustic contrasts between speech and ambient noise? , 2014, Comput. Speech Lang..
[16] Mark J. F. Gales,et al. Variance compensation within the MLLR framework for robust speech recognition and speaker adaptation , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[17] John H. L. Hansen,et al. Analysis and compensation of speech under stress and noise for environmental robustness in speech recognition , 1996, Speech Commun..
[18] John H. L. Hansen,et al. Morphological constrained feature enhancement with adaptive cepstral compensation (MCE-ACC) for speech recognition in noise and Lombard effect , 1994, IEEE Trans. Speech Audio Process..
[19] John H. L. Hansen,et al. Advancements in whisper-island detection within normally phonated audio streams , 2009, INTERSPEECH.
[20] John H. L. Hansen,et al. Speaker Identification Within Whispered Speech Audio Streams , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[21] John H. L. Hansen,et al. Lombard effect compensation for robust automatic speech recognition in noise , 1990, ICSLP.
[22] Rajesh M. Hegde,et al. Significance of parametric spectral ratio methods in detection and recognition of whispered speech , 2012, EURASIP J. Adv. Signal Process..
[23] Bin Ma,et al. A whispered Mandarin corpus for speech technology applications , 2014, INTERSPEECH.
[24] Li Deng,et al. HMM adaptation using vector taylor series for noisy speech recognition , 2000, INTERSPEECH.
[25] Pedro J. Moreno,et al. Speech recognition in noisy environments , 1996 .
[26] Kazuya Takeda,et al. Acoustic analysis and recognition of whispered speech , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[27] Jonas Beskow,et al. Wavesurfer - an open source speech tool , 2000, INTERSPEECH.
[28] H. Traunmüller,et al. A comparative study of the male and female whispered and phonated versions of the long vowels of Swedish , 2022, Dept. for Speech, Music and Hearing Quarterly Progress and Status Report.
[29] Maëva Garnier. Communicating in noisy environments : from adaptation to vocal loading , 2007 .
[30] I. Mcloughlin,et al. A comprehensive vowel space for whispered speech. , 2012, Journal of voice : official journal of the Voice Foundation.
[31] Pascal Vincent,et al. Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion , 2010, J. Mach. Learn. Res..
[32] Yoshua Bengio,et al. Extracting and composing robust features with denoising autoencoders , 2008, ICML '08.
[33] Hideki Kasuya,et al. Acoustic nature of the whisper , 1999, EUROSPEECH.
[34] Kazuya Takeda,et al. Acoustic analysis and recognition of whispered speech , 2001, IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01..
[35] Li Lee,et al. Speaker normalization using efficient frequency warping procedures , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[36] Kazuya Takeda,et al. Analysis and recognition of whispered speech , 2005, Speech Commun..
[37] John H. L. Hansen,et al. Generative modeling of pseudo-target domain adaptation samples for whispered speech recognition , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[38] John H. L. Hansen,et al. A comparative study of traditional and newly proposed features for recognition of speech under stress , 2000, IEEE Trans. Speech Audio Process..
[39] Tetsuji Ogawa,et al. Influence of Lombard Effect: Accuracy Analysis of Simulation-Based Assessments of Noisy Speech Recognition Systems for Various Recognition Conditions , 2009, IEICE Trans. Inf. Syst..
[40] Chi Zhang,et al. Microphone array processing for distance speech capture: A probe study on whisper speech detection , 2010, 2010 Conference Record of the Forty Fourth Asilomar Conference on Signals, Systems and Computers.
[41] Liang Lu,et al. Noise-robust whispered speech recognition using a non-audible-murmur microphone with VTS compensation , 2012, 2012 8th International Symposium on Chinese Spoken Language Processing.
[42] Tanja Schultz,et al. Whispery speech recognition using adapted articulatory features , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..
[43] Carlos Busso,et al. Lipreading approach for isolated digits recognition under whisper and neutral speech , 2014, INTERSPEECH.
[44] Yasuo Horiuchi,et al. Reverberant speech recognition based on denoising autoencoder , 2013, INTERSPEECH.
[45] Paul Boersma,et al. Praat, a system for doing phonetics by computer , 2002 .
[46] Jeesun Kim,et al. Comparing the consistency and distinctiveness of speech produced in quiet and in noise , 2014, Comput. Speech Lang..
[47] H Hermansky,et al. Perceptual linear predictive (PLP) analysis of speech. , 1990, The Journal of the Acoustical Society of America.
[48] John H. L. Hansen,et al. UTDrive: Emotion and Cognitive Load Classification for In-Vehicle Scenarios , 2011 .
[49] James R. Glass,et al. Speech feature denoising and dereverberation via deep autoencoders for noisy reverberant speech recognition , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[50] John H. L. Hansen,et al. N-channel hidden Markov models for combined stressed speech classification and recognition , 1999, IEEE Trans. Speech Audio Process..
[51] D G Childers,et al. Vocal quality factors: analysis, synthesis, and perception. , 1991, The Journal of the Acoustical Society of America.
[52] W. Heeren,et al. Perception of prosody in normal and whispered French. , 2014, The Journal of the Acoustical Society of America.
[53] B. Atal. Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification. , 1974, The Journal of the Acoustical Society of America.
[54] Martin Cooke,et al. Spectral and temporal changes to speech produced in the presence of energetic and informational maskers. , 2010, The Journal of the Acoustical Society of America.
[55] Yee Whye Teh,et al. A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.
[56] Evandro B. Gouvêa,et al. Speaker normalization through formant-based warping of the frequency scale , 1997, EUROSPEECH.
[57] John H. L. Hansen,et al. ICARUS: Source generator based real-time recognition of speech in noisy stressful and Lombard effect environments , 1995, Speech Commun..
[58] Stan Davis,et al. Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences , 1980 .