Deconfounded Video Moment Retrieval with Causal Intervention
Meng Wang | Fuli Feng | Xun Yang | Tat-Seng Chua | Wei Ji
[1] Ali Farhadi,et al. Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding , 2016, ECCV.
[2] Richang Hong,et al. Learning to Compose and Reason with Language Tree Structures for Visual Grounding , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[3] Qi Tian,et al. Cross-modal Moment Localization in Videos , 2018, ACM Multimedia.
[4] D. Blei,et al. Causal Inference for Recommender Systems , 2020, RecSys.
[5] Meng Wang,et al. Harvesting visual concepts for image search with complex queries , 2012, ACM Multimedia.
[6] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[7] Ah-Hwee Tan,et al. Discovering and Exploiting Causal Dependencies for Robust Mobile Context-Aware Recommenders , 2007, IEEE Transactions on Knowledge and Data Engineering.
[8] Hao Zhang,et al. Span-based Localizing Network for Natural Language Video Localization , 2020, ACL.
[9] Tao Mei,et al. To Find Where You Talk: Temporal Sentence Localization in Video with Attention Based Location Regression , 2018, AAAI.
[10] Bohyung Han,et al. Local-Global Video-Text Interactions for Temporal Grounding , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[11] Jiebo Luo,et al. Learning 2D Temporal Adjacent Networks for Moment Localization with Natural Language , 2019, AAAI.
[12] Tomoko Ohkuma,et al. Unbiased Learning for the Causal Effect of Recommendation , 2020, RecSys.
[13] Zhiwei Xiong,et al. Dual Path Interaction Network for Video Moment Localization , 2020, ACM Multimedia.
[14] Meng Wang,et al. Learning Visual Semantic Relationships for Efficient Visual Retrieval , 2015, IEEE Transactions on Big Data.
[15] Jinhui Tang,et al. Causal Intervention for Weakly-Supervised Semantic Segmentation , 2020, NeurIPS.
[16] Yitian Yuan,et al. Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[17] Hanwang Zhang,et al. Visual Commonsense R-CNN , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Vladimir Vapnik,et al. An overview of statistical learning theory , 1999, IEEE Trans. Neural Networks.
[19] Meng Jian,et al. Weakly-Supervised Video Object Grounding by Exploring Spatio-Temporal Contexts , 2020, ACM Multimedia.
[20] Bing Deng,et al. The Blessings of Unlabeled Background in Untrimmed Videos , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Esa Rahtu,et al. Uncovering Hidden Challenges in Query-Based Video Moment Retrieval , 2020, BMVC.
[22] Mélanie Frappier,et al. The Book of Why: The New Science of Cause and Effect , 2018, Science.
[23] Pierre Baldi,et al. The dropout learning algorithm , 2014, Artif. Intell.
[24] Juan Carlos Niebles,et al. Dense-Captioning Events in Videos , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[25] Runhao Zeng,et al. Dense Regression Network for Video Grounding , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Ramakant Nevatia,et al. TALL: Temporal Activity Localization via Language Query , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[27] Hanwang Zhang,et al. Interventional Few-Shot Learning , 2020, NeurIPS.
[28] Maria L. Rizzo,et al. Measuring and testing dependence by correlation of distances , 2007, arXiv:0803.4101.
[29] Larry S. Davis,et al. MAN: Moment Alignment Network for Natural Language Moment Retrieval via Iterative Graph Adjustment , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Meng Liu,et al. Attentive Moment Retrieval in Videos , 2018, SIGIR.
[31] Xirong Li,et al. Predicting Visual Features From Text for Image and Video Caption Retrieval , 2017, IEEE Transactions on Multimedia.
[32] Xiao-Yang Liu,et al. Jointly Cross- and Self-Modal Graph Attention Network for Query-Based Moment Localization , 2020, ACM Multimedia.
[33] Trevor Darrell,et al. Localizing Moments in Video with Temporal Language , 2018, EMNLP.
[34] Vighnesh Birodkar,et al. Unsupervised Learning of Disentangled Representations from Video , 2017, NIPS.
[35] Xiangnan He,et al. Clicks can be Cheating: Counterfactual Recommendation for Mitigating Clickbait Issue , 2020, SIGIR.
[36] Jingwen Wang,et al. Temporally Grounding Language Queries in Videos by Contextual Boundary-aware Prediction , 2020, AAAI.
[37] Trevor Darrell,et al. Localizing Moments in Video with Natural Language , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[38] Lin Ma,et al. Temporally Grounding Natural Sentence in Video , 2018, EMNLP.
[39] Anton van den Hengel,et al. On the Value of Out-of-Distribution Testing: An Example of Goodhart's Law , 2020, NeurIPS.
[40] Zhou Zhao,et al. Cross-Modal Interaction Networks for Query-Based Moment Retrieval in Videos , 2019, SIGIR.
[41] Meng Wang,et al. Person Re-Identification With Metric Learning Using Privileged Information , 2018, IEEE Transactions on Image Processing.
[42] Tat-Seng Chua,et al. Tree-Augmented Cross-Modal Encoding for Complex-Query Video Retrieval , 2020, SIGIR.
[43] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[44] Jeffrey Pennington,et al. GloVe: Global Vectors for Word Representation , 2014, EMNLP.
[45] Rui Qiao,et al. Interventional Video Grounding with Dual Contrastive Learning , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[46] Andrew Zisserman,et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[47] Lorenzo Torresani,et al. Learning Spatiotemporal Features with 3D Convolutional Networks , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[48] Hanwang Zhang,et al. Two Causal Principles for Improving Visual Dialog , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[49] J. Pearl,et al. Causal Inference in Statistics: A Primer , 2016.
[50] Xiangnan He,et al. Should Graph Convolution Trust Neighbors? A Simple Causal Inference Method , 2020, SIGIR.
[51] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.