Cascading appearance-based features for visual voice activity detection
[1] I. Boyd, et al. The voice activity detector for the Pan-European digital cellular mobile telephone service, 1988, International Conference on Acoustics, Speech, and Signal Processing.
[2] Giridharan Iyengar, et al. A cascade image transform for speaker independent automatic speechreading, 2000, 2000 IEEE International Conference on Multimedia and Expo. ICME2000. Proceedings. Latest Advances in the Fast Changing World of Multimedia (Cat. No.00TH8532).
[3] Maurizio Omologo, et al. Use of a CSP-based voice activity detector for distant-talking ASR, 2003, INTERSPEECH.
[4] A. Kondoz, et al. Analysis and improvement of a statistical model-based voice activity detector, 2001, IEEE Signal Processing Letters.
[5] Wonyong Sung, et al. A statistical model-based voice activity detection, 1999, IEEE Signal Processing Letters.
[6] Christian Jutten, et al. An Analysis of Visual Speech Information Applied to Voice Activity Detection, 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
[7] Chalapathy Neti, et al. Audio-visual speech recognition in challenging environments, 2003, INTERSPEECH.
[8] Wei Zhang, et al. A soft voice activity detector based on a Laplacian-Gaussian model, 2003, IEEE Trans. Speech Audio Process.
[9] Peng Liu, et al. Voice activity detection using visual information, 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[10] Paul A. Viola, et al. Rapid object detection using a boosted cascade of simple features, 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.
[11] Gerasimos Potamianos, et al. An Embedded System for In-Vehicle Visual Speech Activity Detection, 2007, 2007 IEEE 9th Workshop on Multimedia Signal Processing.
[12] Juergen Luettin, et al. Audio-Visual Automatic Speech Recognition: An Overview, 2004.
[13] H.S. Jamadagni, et al. VAD techniques for real-time speech transmission on the Internet, 2002, 5th IEEE International Conference on High Speed Networks and Multimedia Communication (Cat. No.02EX612).
[14] Gerasimos Potamianos, et al. Lipreading Using Profile Versus Frontal Views, 2006, 2006 IEEE Workshop on Multimedia Signal Processing.
[15] J.N. Gowdy, et al. CUAVE: A new audio-visual database for multimodal human-computer interface research, 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.