Extending Linear Dynamical Systems with Dynamic Stream Weights for Audiovisual Speaker Localization
[1] Dorothea Kolossa,et al. A new EM estimation of dynamic stream weights for coupled-HMM-based audio-visual ASR , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Ning Ma,et al. Improving audio-visual speech recognition using deep neural networks with dynamic stream reliability estimates , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Jon Barker,et al. An audio-visual corpus for speech perception and automatic speech recognition , 2006, The Journal of the Acoustical Society of America.
[4] Dorothea Kolossa,et al. Learning Dynamic Stream Weights For Coupled-HMM-Based Audio-Visual Speech Recognition , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[5] Robert M. Nickel,et al. Dynamic Stream Weighting for Turbo-Decoding-Based Audiovisual ASR , 2016, INTERSPEECH.
[6] Gerasimos Potamianos,et al. Discriminative training of HMM stream exponents for audio-visual speech recognition , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[7] R. O. Schmidt,et al. Multiple emitter location and signal parameter estimation , 1986 .
[8] Radu Horaud,et al. Audio-Visual Speaker Diarization Based on Spatiotemporal Bayesian Fusion , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[9] Gwenn Englebienne,et al. Multimodal Speaker Diarization , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[10] Britta Wrede,et al. Computational Audiovisual Scene Analysis in Online Adaptation of Audio-Motor Maps , 2013, IEEE Transactions on Autonomous Mental Development.
[11] Boaz Rafaely,et al. Localization of Multiple Speakers under High Reverberation using a Spherical Microphone Array and the Direct-Path Dominance Test , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[12] H.K. Ekenel,et al. Kalman filters for audio-video source localization , 2005, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2005.
[13] Paul A. Viola,et al. Rapid object detection using a boosted cascade of simple features , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.
[14] V. Udayashankara,et al. Automatic bimodal audiovisual speech recognition: A review , 2014, 2014 International Conference on Contemporary Computing and Informatics (IC3I).
[15] JongSuk Choi,et al. Audio-visual integration for human-robot interaction in multi-person scenarios , 2014, Proceedings of the 2014 IEEE Emerging Technology and Factory Automation (ETFA).
[16] Martin Heckmann,et al. Environmentally robust audio-visual speaker identification , 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[17] T. Başar,et al. A New Approach to Linear Filtering and Prediction Problems , 2001 .
[18] Jean-Philippe Thiran,et al. On Dynamic Stream Weighting for Audio-Visual Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[19] Masahide Kaneko,et al. Probabilistic integration of audiovisual information to localize sound source in human-robot interaction , 2003, The 12th IEEE International Workshop on Robot and Human Interactive Communication, 2003. Proceedings. ROMAN 2003.
[20] Georges Linarès,et al. Audiovisual speaker diarization of TV series , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).