Enhanced speaker diarization with detection of backchannels using eye-gaze information in poster conversations
Tatsuya Kawahara | Koji Inoue | Katsuya Takanashi | Hiromasa Yoshimoto | Yukoh Wakabayashi
[1] Hiroshi G. Okuno,et al. A Speaker Diarization System with Robust Speaker Localization and Voice Activity Detection, 2013.
[2] Masafumi Nishida,et al. Turn-alignment using eye-gaze and speech in conversational interaction , 2010, INTERSPEECH.
[3] Junji Yamato,et al. Analysis and modeling of next speaking start timing based on gaze behavior in multi-party meetings , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Hiroshi Sawada,et al. Probabilistic Speaker Diarization With Bag-of-Words Representations of Speaker Angle Information , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[5] Gerald Friedland,et al. The ICSI RT-09 Speaker Diarization System , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[6] Tatsuya Kawahara,et al. Speaker diarization based on audio-visual integration for smart posterboard , 2014, 2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA).
[7] A. Ichikawa,et al. An Analysis of Turn-Taking and Backchannels Based on Prosodic and Syntactic Features in Japanese Map Task Dialogs , 1998, Language and speech.
[8] R. O. Schmidt,et al. Multiple emitter location and signal parameter estimation, 1986.
[9] Seiichi Nakagawa,et al. Response Timing Detection Using Prosodic and Linguistic Information for Human-friendly Spoken Dialog Systems (Special Issue: Information Systems in Symbiosis with Humans), 2005.
[10] Tatsuya Kawahara,et al. Prediction of Turn-Taking by Combining Prosodic and Eye-Gaze Information in Poster Conversations , 2012, INTERSPEECH.
[11] A. Kendon. Some functions of gaze-direction in social interaction , 1967, Acta psychologica.
[12] Louis-Philippe Morency,et al. A probabilistic multimodal approach for predicting listener backchannels , 2009, Autonomous Agents and Multi-Agent Systems.
[13] Takeshi Yamada,et al. Detection of Overlapping Speech in Meetings Using Support Vector Machines and Support Vector Regression , 2006, IEICE Trans. Fundam. Electron. Commun. Comput. Sci.
[14] Douglas A. Reynolds,et al. A study of new approaches to speaker diarization , 2009, INTERSPEECH.
[15] S. Duncan,et al. Some Signals and Rules for Taking Speaking Turns in Conversations, 1972.
[16] Tatsuya Kawahara,et al. Speaker diarization using eye-gaze information in multi-party conversations , 2014, INTERSPEECH.
[17] Climent Nadeu,et al. Automatic Speech Activity Detection, Source Localization, and Speech Recognition on the CHIL Seminar Corpus , 2005, 2005 IEEE International Conference on Multimedia and Expo.
[18] Jean Carletta,et al. The AMI Meeting Corpus: A Pre-announcement , 2005, MLMI.
[19] Xavier Anguera Miró,et al. Acoustic Beamforming for Speaker Diarization of Meetings , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[20] Mary P. Harper,et al. VACE Multimodal Meeting Corpus , 2005, MLMI.
[21] Tatsuya Kawahara,et al. Smart posterboard: Multi-modal sensing and analysis of poster conversations , 2013, 2013 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference.
[22] Xavier Anguera Miró,et al. Speaker diarization for multiple distant microphone meetings: mixing acoustic features and inter-channel time differences , 2006, INTERSPEECH.
[23] Daniel Gatica-Perez,et al. Automatic nonverbal analysis of social interaction in small groups: A review , 2009, Image Vis. Comput.
[24] Yuichi Nakamura,et al. Cubistic Representation for Real-Time 3D Shape and Pose Estimation of Unknown Rigid Object , 2013, 2013 IEEE International Conference on Computer Vision Workshops.
[25] Douglas A. Reynolds,et al. An overview of automatic speaker diarization systems , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[26] Athanasios Katsamanis,et al. Acoustic and Visual Cues of Turn-Taking Dynamics in Dyadic Interactions , 2011, INTERSPEECH.
[27] Tatsuya Kawahara,et al. Estimation of interest and comprehension level of audience through multi-modal behaviors in poster conversations , 2013, INTERSPEECH.
[28] S. Araki,et al. A DOA Based Speaker Diarization System for Real Meetings , 2008, 2008 Hands-Free Speech Communication and Microphone Arrays.
[29] Nigel G. Ward,et al. Prosodic features which cue back-channel responses in English and Japanese, 2000.
[30] Louis-Philippe Morency,et al. Modeling Wisdom of Crowds Using Latent Mixture of Discriminative Experts , 2011, ACL.
[31] Tatsuya Kawahara,et al. Detection of hot spots in poster conversations based on reactive tokens of audience , 2010, INTERSPEECH.
[32] Hervé Bourlard,et al. New entropy based combination rules in HMM/ANN multi-stream ASR , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
[33] Peter Wittenburg,et al. Speaker diarization using gesture and speech , 2014, INTERSPEECH.