Improving Video Retrieval by Adaptive Margin
Yajuan Lü | Xiao Tan | Feng He | Zhifan Feng | Wenbin Jiang | Qi Wang | Yong Zhu
[1] Shiliang Pu, et al. Learning Incremental Triplet Margin for Person Re-identification, 2018, AAAI.
[2] Wei Wang, et al. A Comprehensive Survey on Cross-modal Retrieval, 2016, ArXiv.
[3] Lior Wolf, et al. Associating neural word embeddings with deep image representations using Fisher Vectors, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.
[5] Cees Snoek, et al. Composite Concept Discovery for Zero-Shot Video Event Detection, 2014, ICMR.
[6] Gunhee Kim, et al. A Joint Sequence Fusion Model for Video Question Answering and Retrieval, 2018, ECCV.
[7] Yang Liu, et al. Use What You Have: Video retrieval using representations from collaborative experts, 2019, BMVC.
[8] Amit K. Roy-Chowdhury, et al. Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval, 2018, ICMR.
[9] Tat-Seng Chua, et al. Tree-Augmented Cross-Modal Encoding for Complex-Query Video Retrieval, 2020, SIGIR.
[10] David J. Fleet, et al. VSE++: Improving Visual-Semantic Embeddings with Hard Negatives, 2017, BMVC.
[11] Hossein Mobahi, et al. Self-Distillation Amplifies Regularization in Hilbert Space, 2020, NeurIPS.
[12] Xirong Li, et al. Predicting Visual Features From Text for Image and Video Caption Retrieval, 2017, IEEE Transactions on Multimedia.
[13] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[14] Kilian Q. Weinberger, et al. Densely Connected Convolutional Networks, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Kilian Q. Weinberger, et al. Distance Metric Learning for Large Margin Nearest Neighbor Classification, 2005, NIPS.
[16] Shizhe Chen, et al. Fine-Grained Video-Text Retrieval With Hierarchical Graph Reasoning, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[17] David Semedo, et al. Cross-Modal Subspace Learning with Scheduled Adaptive Margin Constraints, 2019, ACM Multimedia.
[18] James Allan, et al. Zero-shot video retrieval using content and concepts, 2013, CIKM.
[19] Shuai Zhang, et al. Symmetric Metric Learning with Adaptive Margin for Recommendation, 2020, AAAI.
[20] Ivan Laptev, et al. HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[21] Heeyoul Choi, et al. Self-Knowledge Distillation in Natural Language Processing, 2019, RANLP.
[22] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[23] Slobodan Ilic, et al. 3D object instance recognition and pose estimation using triplet loss with dynamic margin, 2017, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[24] Ivan Laptev, et al. Learning a Text-Video Embedding from Incomplete and Heterogeneous Data, 2018, ArXiv.
[25] Tao Mei, et al. MSR-VTT: A Large Video Description Dataset for Bridging Video and Language, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Li Fei-Fei, et al. ImageNet: A large-scale hierarchical image database, 2009, CVPR.
[27] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Chen Sun, et al. Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification, 2017, ECCV.
[29] Juan Carlos Niebles, et al. Dense-Captioning Events in Videos, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[30] Yi Yang, et al. Semantic Concept Discovery for Large-Scale Zero-Shot Event Detection, 2015, IJCAI.
[31] Bernt Schiele, et al. A dataset for Movie Description, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Jongwook Choi, et al. End-to-End Concept Word Detection for Video Captioning, Retrieval, and Question Answering, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Gim Hee Lee, et al. CVM-Net: Cross-View Matching Network for Image-Based Ground-to-Aerial Geo-Localization, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[34] Iryna Gurevych, et al. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks, 2019, EMNLP.
[35] Bowen Zhang, et al. Cross-Modal and Hierarchical Modeling of Video and Text, 2018, ECCV.
[36] Dima Damen, et al. Fine-Grained Action Retrieval Through Multiple Parts-of-Speech Embeddings, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[37] Kaisheng Ma, et al. Be Your Own Teacher: Improve the Performance of Convolutional Neural Networks via Self Distillation, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[38] Chen Sun, et al. Multi-modal Transformer for Video Retrieval, 2020, ECCV.
[39] Xirong Li, et al. Dual Encoding for Zero-Example Video Retrieval, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).