Finding Speaker Face Region by Audiovisual Correlation
[1] Michael Elad, et al. Pixels that sound, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[2] Pierre Vandergheynst, et al. Analysis of multimodal signals using redundant representations, 2005, IEEE International Conference on Image Processing 2005.
[3] Javier R. Movellan, et al. Audio Vision: Using Audio-Visual Synchrony to Locate Sounds, 1999, NIPS.
[4] Takeo Kanade, et al. An Iterative Image Registration Technique with an Application to Stereo Vision, 1981, IJCAI.
[5] Gareth Funka-Lea, et al. Graph Cuts and Efficient N-D Image Segmentation, 2006, International Journal of Computer Vision.
[6] John W. Fisher, et al. A novel measure for independent component analysis (ICA), 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[7] E. Parzen. On Estimation of a Probability Density Function and Mode, 1962.
[8] J. Driver. Enhancement of selective listening by illusory mislocation of speech sounds due to lip-reading, 1996, Nature.
[9] Trevor Darrell, et al. Speaker association with signal-level audiovisual fusion, 2004, IEEE Transactions on Multimedia.
[10] Pierre Vandergheynst, et al. Blind Audio-Visual Source Separation Using Sparse Redundant Representations, 2006.
[11] Pierre Vandergheynst, et al. Blind Audiovisual Source Separation Based on Sparse Redundant Representations, 2010, IEEE Transactions on Multimedia.
[12] Sabri Gurbuz, et al. Moving-Talker, Speaker-Independent Feature Study, and Baseline Results Using the CUAVE Multimodal Speech Corpus, 2002, EURASIP J. Adv. Signal Process.
[13] Kenichi Kanatani, et al. Motion segmentation by subspace separation and model selection, 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.
[14] Paul A. Viola, et al. Robust Real-Time Face Detection, 2001, International Journal of Computer Vision.
[15] Biing-Hwang Juang, et al. Fundamentals of speech recognition, 1993, Prentice Hall signal processing series.
[16] Juergen Luettin, et al. Speaker identification by lipreading, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[17] Bill Triggs, et al. Histograms of oriented gradients for human detection, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).