Temporal Context Aggregation for Video Retrieval with Contrastive Learning
Jie Shao | Xiangyang Xue | Xin Wen | Bingchen Zhao
[1] Chong-Wah Ngo,et al. Towards optimal bag-of-features for object categorization and semantic video retrieval , 2007, CIVR '07.
[2] Hao Wang,et al. An image-based near-duplicate video retrieval and localization using improved Edit distance , 2017, Multimedia Tools and Applications.
[3] Hung-Khoon Tan,et al. Scalable detection of partial near-duplicate videos by visual-temporal consistency , 2009, ACM Multimedia.
[4] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[5] Phillip Isola,et al. Contrastive Multiview Coding , 2019, ECCV.
[6] Jinchao Xia,et al. Weakly Supervised EM Process For Temporal Localization Within Video , 2019.
[7] Alexander Sergeev,et al. Horovod: fast and easy distributed deep learning in TensorFlow , 2018, ArXiv.
[8] Yoshua Bengio,et al. On the Properties of Neural Machine Translation: Encoder–Decoder Approaches , 2014, SSST@EMNLP.
[9] Florent Perronnin,et al. Fisher Kernels on Visual Vocabularies for Image Categorization , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[10] Qi Tian,et al. SIFT Meets CNN: A Decade Survey of Instance Retrieval , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[11] David G. Lowe,et al. Distinctive Image Features from Scale-Invariant Keypoints , 2004, International Journal of Computer Vision.
[12] Victor S. Lempitsky,et al. Aggregating Local Deep Features for Image Retrieval , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[13] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[14] Jiajun Wang,et al. VCDB: A Large-Scale Database for Partial Copy Detection in Videos , 2014, ECCV.
[15] Oriol Vinyals,et al. Representation Learning with Contrastive Predictive Coding , 2018, ArXiv.
[16] Ronan Sicre,et al. Particular object retrieval with integral max-pooling of CNN activations , 2015, ICLR.
[17] Fei Wang,et al. Million-scale near-duplicate video retrieval system , 2011, ACM Multimedia.
[18] Geoffrey E. Hinton,et al. A Simple Framework for Contrastive Learning of Visual Representations , 2020, ICML.
[19] Andrew Zisserman,et al. Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.
[20] Matthijs Douze,et al. LAMV: Learning to Align and Match Videos with Kernelized Temporal Layers , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[21] Zi Huang,et al. Effective Multiple Feature Hashing for Large-Scale Near-Duplicate Video Retrieval , 2013, IEEE Transactions on Multimedia.
[22] Yoshua Bengio,et al. Learning deep representations by mutual information estimation and maximization , 2018, ICLR.
[23] Yang Feng,et al. Video Re-localization , 2018, ECCV.
[24] Meng Wang,et al. Unsupervised t-Distributed Video Hashing and Its Deep Hashing Extension , 2017, IEEE Transactions on Image Processing.
[25] Meng Wang,et al. Stochastic Multiview Hashing for Large-Scale Near-Duplicate Video Retrieval , 2017, IEEE Transactions on Multimedia.
[26] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[27] Guangfeng Lin,et al. IR Feature Embedded BOF Indexing Method for Near-Duplicate Video Retrieval , 2019, IEEE Transactions on Circuits and Systems for Video Technology.
[28] Frank Hutter,et al. SGDR: Stochastic Gradient Descent with Warm Restarts , 2016, ICLR.
[29] Qi Tian,et al. Good Practice in CNN Feature Transfer , 2016, ArXiv.
[30] Gabriela Csurka,et al. Visual categorization with bags of keypoints , 2002, ECCV 2004.
[31] Yiannis Kompatsiaris,et al. Near-Duplicate Video Retrieval by Aggregating Intermediate CNN Layers , 2017, MMM.
[32] Ondrej Chum,et al. CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples , 2016, ECCV.
[33] Trevor Darrell,et al. Long-term recurrent convolutional networks for visual recognition and description , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Albert Gordo,et al. End-to-End Learning of Deep Visual Representations for Image Retrieval , 2016, International Journal of Computer Vision.
[35] Zi Huang,et al. Multiple feature hashing for real-time large scale near-duplicate video retrieval , 2011, ACM Multimedia.
[36] Kihyuk Sohn,et al. Improved Deep Metric Learning with Multi-class N-pair Loss Objective , 2016, NIPS.
[37] Matti Pietikäinen,et al. Dynamic Texture Recognition Using Local Binary Patterns with an Application to Facial Expressions , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[38] Cordelia Schmid,et al. Stable Hyper-pooling and Query Expansion for Event Detection , 2013, 2013 IEEE International Conference on Computer Vision.
[39] Ioannis Patras,et al. FIVR: Fine-Grained Incident Video Retrieval , 2018, IEEE Transactions on Multimedia.
[40] Xiaobo Lu,et al. Learning spatial-temporal features for video copy detection by the combination of CNN and RNN , 2018, J. Vis. Commun. Image Represent.
[41] Yulong Xu,et al. MS-RMAC: Multiscale Regional Maximum Activation of Convolutions for Image Retrieval , 2017, IEEE Signal Processing Letters.
[42] Abhinav Gupta,et al. Non-local Neural Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[43] Geoffrey E. Hinton,et al. Visualizing Data using t-SNE , 2008.
[44] Yiannis Kompatsiaris,et al. Near-Duplicate Video Retrieval with Deep Metric Learning , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).
[45] Kilian Q. Weinberger,et al. Distance Metric Learning for Large Margin Nearest Neighbor Classification , 2005, NIPS.
[46] Cordelia Schmid,et al. Aggregating local descriptors into a compact image representation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[47] Kaiming He,et al. Momentum Contrast for Unsupervised Visual Representation Learning , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[48] Stéphane Dupont,et al. Towards Good Practices for Image Retrieval Based on CNN Features , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).
[49] Chong-Wah Ngo,et al. Practical elimination of near-duplicates from web video search , 2007, ACM Multimedia.
[50] Christopher Hunt,et al. Notes on the OpenSURF Library , 2009.
[51] Gongping Yang,et al. Global-view hashing: harnessing global relations in near-duplicate video retrieval , 2018, World Wide Web.
[52] Yichen Wei,et al. Relation Networks for Object Detection , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[53] Yiannis Kompatsiaris,et al. ViSiL: Fine-Grained Spatio-Temporal Video Similarity Learning , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[54] Atsuto Maki,et al. Visual Instance Retrieval with Deep Convolutional Networks , 2014, ICLR.
[55] Yichen Wei,et al. Circle Loss: A Unified Perspective of Pair Similarity Optimization , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[56] Hervé Jégou,et al. Negative Evidences and Co-occurences in Image Retrieval: The Benefit of PCA and Whitening , 2012, ECCV.
[57] Cordelia Schmid,et al. Event Retrieval in Large Video Collections with Circulant Temporal Encoding , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[58] Natalia Gimelshein,et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.
[59] Stella X. Yu,et al. Unsupervised Feature Learning via Non-parametric Instance Discrimination , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[60] Fei Wang,et al. Real-time large scale near-duplicate web video retrieval , 2010, ACM Multimedia.
[61] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[62] Bingchen Zhao,et al. Distilling Visual Priors from Self-Supervised Learning , 2020, ECCV Workshops.
[63] Geoffrey E. Hinton,et al. Layer Normalization , 2016, ArXiv.
[64] Kiyoharu Aizawa,et al. Self-similarity-based partial near-duplicate video retrieval and alignment , 2013, International Journal of Multimedia Information Retrieval.