AV-SAM: Segment Anything Model Meets Audio-Visual Localization and Segmentation
[1] Yapeng Tian, et al. Audio-Visual Grouping Network for Sound Localization from Mixtures, 2023, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Shentong Mo, et al. A Closer Look at Weakly-Supervised Audio-Visual Source Localization, 2022, NeurIPS.
[3] Stan Birchfield, et al. Audio-Visual Segmentation, 2022, ECCV.
[4] Shentong Mo, et al. Localizing Visual Sounds the Easy Way, 2022, ECCV.
[5] Yapeng Tian, et al. Multi-modal Grouping Network for Weakly-Supervised Audio-Visual Video Parsing, 2022, NeurIPS.
[6] Yi Li, et al. Learning Representations from Audio-Visual Spatial Alignment, 2020, NeurIPS.
[7] Chenliang Xu, et al. Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing, 2020, ECCV.
[8] Andrew Zisserman, et al. VGGSound: A Large-Scale Audio-Visual Dataset, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Nuno Vasconcelos, et al. Self-Supervised Generation of Spatial Audio for 360 Video, 2018, NIPS.
[10] Chenliang Xu, et al. Audio-Visual Event Localization in Unconstrained Videos, 2018, ECCV.
[11] Tae-Hyun Oh, et al. Learning to Localize Sound Source in Visual Scenes, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[12] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Fei-Fei Li, et al. ImageNet: A large-scale hierarchical image database, 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.