Unsupervised extraction of audio-visual objects
[1] Javier R. Movellan, et al. Audio Vision: Using Audio-Visual Synchrony to Locate Sounds, 1999, NIPS.
[2] Pierre Vandergheynst, et al. Nonlinear Video Diffusion based on Audio-Video Synchrony, 2010.
[3] Jian Sun, et al. Video object cut and paste, 2005, SIGGRAPH 2005.
[4] Sabri Gurbuz, et al. Moving-Talker, Speaker-Independent Feature Study, and Baseline Results Using the CUAVE Multimodal Speech Corpus, 2002, EURASIP J. Adv. Signal Process.
[5] Pierre Vandergheynst, et al. Blind Audiovisual Source Separation Based on Sparse Redundant Representations, 2010, IEEE Transactions on Multimedia.
[6] Marie-Pierre Jolly, et al. Interactive Graph Cuts for Optimal Boundary and Region Segmentation of Objects in N-D Images, 2001, ICCV.
[7] Andrew Blake, et al. "GrabCut", 2004, ACM Trans. Graph.
[8] Michael Elad, et al. Cross-Modal Localization via Sparsity, 2007, IEEE Transactions on Signal Processing.
[9] Trevor Darrell, et al. Speaker association with signal-level audiovisual fusion, 2004, IEEE Transactions on Multimedia.
[10] Marie-Pierre Jolly, et al. Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images, 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.
[11] Yoichi Sato, et al. Visual localization of non-stationary sound sources, 2009, ACM Multimedia.
[12] Yoichi Sato, et al. Finding Speaker Face Region by Audiovisual Correlation, 2008.