Video clip recognition using joint audio-visual processing model
[1] Chalapathy Neti, et al. Stream confidence estimation for audio-visual speech recognition, 2000, INTERSPEECH.
[2] H. McGurk, et al. Hearing lips and seeing voices, 1976, Nature.
[3] Douglas A. Reynolds, et al. Speaker Verification Using Adapted Gaussian Mixture Models, 2000, Digit. Signal Process.
[4] Gerasimos Potamianos, et al. Discriminative training of HMM stream exponents for audio-visual speech recognition, 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98.
[5] David G. Stork, et al. Speech recognition and sensory integration, 1998.
[6] Giridharan Iyengar, et al. A cascade image transform for speaker independent automatic speechreading, 2000, 2000 IEEE International Conference on Multimedia and Expo. ICME2000. Proceedings. Latest Advances in the Fast Changing World of Multimedia.
[7] Chalapathy Neti, et al. Audio-visual large vocabulary continuous speech recognition in the broadcast domain, 1999, 1999 IEEE Third Workshop on Multimedia Signal Processing.
[8] Andrew W. Senior, et al. Recognizing faces in broadcast video, 1999, Proceedings International Workshop on Recognition, Analysis, and Tracking of Faces and Gestures in Real-Time Systems. In Conjunction with ICCV'99.
[9] Wolfgang Effelsberg, et al. On the detection and recognition of television commercials, 1997, Proceedings of IEEE International Conference on Multimedia Computing and Systems.
[10] Ashish Verma, et al. Late integration in audio-visual continuous speech recognition, 1999.
[11] Giridharan Iyengar, et al. Speaker change detection using joint audio-visual statistics, 2000, RIAO.
[12] Chalapathy Neti, et al. Audio-visual intent-to-speak detection for human-computer interaction, 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings.
[13] Biing-Hwang Juang, et al. Fundamentals of speech recognition, 1993, Prentice Hall signal processing series.
[14] Teuvo Kohonen, et al. The self-organizing map, 1990.
[15] Zhu Liu, et al. Multimedia content analysis-using both audio and visual clues, 2000, IEEE Signal Process. Mag.
[16] Alvin F. Martin, et al. The DET curve in assessment of detection task performance, 1997, EUROSPEECH.
[17] Benoît Maison, et al. Audio-visual speaker recognition for video broadcast news: some fusion techniques, 1999, 1999 IEEE Third Workshop on Multimedia Signal Processing.