Speaker localization for microphone array-based ASR: the effects of accuracy on overlapping speech
[1] Iain McCowan,et al. Microphone array speech recognition: experiments on overlapping speech in meetings , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
[2] Satoshi Nakamura,et al. Detection and separation of speech segment using audio and video information fusion , 2003, INTERSPEECH.
[3] J. Karam,et al. Methods in Nucleic Acids Research , 1990 .
[4] G A Petsko,et al. Chemistry and biology. , 2000, Proceedings of the National Academy of Sciences of the United States of America.
[5] Josef Švejcar,et al. Péče o dítě. , 1991 .
[6] Steve Renals,et al. WSJCAM0: a British English speech corpus for large vocabulary continuous speech recognition , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[7] J. Parker,et al. Clinical PET and PET/CT. , 2005 .
[8] Pavel Pavlovský,et al. Soudní psychiatrie a psychologie , 2004 .
[9] Jan Hugo,et al. Velký lékařský slovník. , 2002 .
[10] Andrew Zisserman,et al. Multiple view geometry in computer vision , 2001 .
[11] Jean-Marc Odobez,et al. Multimodal multispeaker probabilistic tracking in meetings , 2005, ICMI '05.
[12] Philip C. Woodland,et al. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models , 1995, Comput. Speech Lang..
[13] J. Foote,et al. WSJCAM0: a British English speech corpus for large vocabulary continuous speech recognition , 1995 .
[14] John W. McDonough,et al. A joint particle filter for audio-visual speaker tracking , 2005, ICMI '05.
[15] Zdeněk Fišar,et al. Vybrané kapitoly z biologické psychiatrie , 2001 .
[16] James L. Crowley,et al. Multi-modal tracking of faces for video communications , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[17] John W. McDonough,et al. Microphone Array Driven Speech Recognition: Influence of Localization on the Word Error Rate , 2005, MLMI.
[18] M. Schneider,et al. Introduction to Public Health , 1988 .
[19] Daniel Gatica-Perez,et al. Speech Acquisition in Meetings with an Audio-Visual Sensor Array , 2005, 2005 IEEE International Conference on Multimedia and Expo.
[20] Naoyuki Ichimura,et al. Detection and Separation of Speech Event Using Audio and Video Information Fusion and Its Application to Robust Speech Interface , 2004, EURASIP J. Adv. Signal Process..
[21] Chin-Hui Lee,et al. Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains , 1994, IEEE Trans. Speech Audio Process..
[22] Anoop Gupta,et al. Distributed meetings: a meeting capture and broadcasting system , 2002, MULTIMEDIA '02.
[23] R. Shulman,et al. Enteral and parenteral nutrition. , 2002 .
[24] Lukás Burget,et al. The AMI System for the Transcription of Speech in Meetings , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[25] Andreas Stolcke,et al. Observations on overlap: findings and implications for automatic processing of multi-party conversation , 2001, INTERSPEECH.
[26] Emanuel Nečas,et al. Obecná patologická fyziologie , 2000 .
[27] Bernhard P. Wrobel,et al. Multiple View Geometry in Computer Vision , 2001 .
[28] Iain McCowan,et al. A sector-based approach for localization of multiple speakers with microphone arrays , 2004, SAPA@INTERSPEECH.