Speech/Non-Speech Detection in Meetings from Automatically Extracted Low Resolution Visual Features
[1] Rashid Ansari, et al. Multimodal human discourse: gesture and speech, 2002, TCHI.
[2] Ben J. A. Kröse, et al. On-line multi-modal speaker diarization, 2007, ICMI '07.
[3] D. McNeill. Language and Gesture: Gesture in action, 2000.
[4] Jean-Marc Odobez, et al. Visual activity context for focus of attention estimation in dynamic meetings, 2009, 2009 IEEE International Conference on Multimedia and Expo.
[5] Dirk Heylen, et al. in head orientation between speakers and listeners in multi-party conversations, 2005.
[6] Douglas A. Reynolds, et al. Approaches and applications of audio diarization, 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.
[7] J. Odobez, et al. A Rao-Blackwellized Mixed State Particle Filter for Head Pose Tracking, 2005.
[8] Alessandro Vinciarelli, et al. Role recognition in multiparty recordings using social affiliation networks and discrete distributions, 2008, ICMI '08.
[9] Alejandro Jaimes. Posture and activity silhouettes for self-reporting, interruption management, and attentive interfaces, 2006, IUI '06.
[10] Harriet J. Nock, et al. Speaker Localisation Using Audio-Visual Synchrony: An Empirical Study, 2003, CIVR.
[11] Chuohao Yeo, et al. Multi-modal speaker diarization of real-world meetings using compressed-domain video features, 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[12] Shaogang Gong, et al. Modelling facial colour and identity with Gaussian mixtures, 1998, Pattern Recognit.
[13] Guy J. Brown, et al. Speech and crosstalk detection in multichannel audio, 2005, IEEE Transactions on Speech and Audio Processing.
[14] Jithendra Vepa, et al. The segmentation of multi-channel meeting recordings for automatic speech recognition, 2006, INTERSPEECH.
[15] Louis-Philippe Morency, et al. Predicting Listener Backchannels: A Probabilistic Multimodal Approach, 2008, IVA.
[16] Trevor Darrell, et al. A multi-modal approach for determining speaker location and focus, 2003, ICMI '03.
[17] Chuohao Yeo, et al. Compressed domain video processing of meetings for activity estimation in dominance classification and slide transition detection, 2008.
[18] Chuohao Yeo, et al. Associating audio-visual activity cues in a dominance estimation framework, 2008, 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.
[19] John W. Fisher, et al. Dynamic Dependency Tests for Audio-Visual Speaker Association, 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[20] Christian A. Müller, et al. A fast-match approach for robust, faster than real-time speaker diarization, 2007, 2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU).
[21] Chuohao Yeo, et al. Modeling Dominance in Group Conversations Using Nonverbal Activity Cues, 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[22] Sudeep Sarkar, et al. Audio Segmentation and Speaker Localization in Meeting Videos, 2006, 18th International Conference on Pattern Recognition (ICPR'06).