Comparing audio and visual information for speech processing
[1] Sridha Sridharan, et al. Robust Face Localisation Using Motion, Colour and Fusion, 2003, DICTA.
[2] Steve Young, et al. The HTK Book, 1995.
[3] Matthew R. McKay, et al. Robust Face Localisation Using Motion, Colour & Fusion, 2003.
[4] Giridharan Iyengar, et al. A cascade image transform for speaker independent automatic speechreading, 2000, 2000 IEEE International Conference on Multimedia and Expo (ICME 2000).
[5] Farzin Deravi, et al. A review of speech-based bimodal recognition, 2002, IEEE Trans. Multim.
[6] Javier R. Movellan, et al. Dynamic Features for Visual Speechreading: A Systematic Comparison, 1996, NIPS.
[7] J.N. Gowdy, et al. CUAVE: A new audio-visual database for multimodal human-computer interface research, 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[8] Sridha Sridharan, et al. Adaptive Fusion of Speech and Lip Information for Robust Speaker Identification, 2001, Digit. Signal Process.