Joint Learning of NNeXtVLAD, CNN and Context Gating for Micro-Video Venue Classification
Xianglin Huang | Gang Cao | Gege Song | Lifang Yang | Wei Liu | Jianglong Zhang
[1] MengChu Zhou,et al. Incorporation of Efficient Second-Order Solvers Into Latent Factor Models for Accurate Prediction of Missing QoS Data , 2018, IEEE Transactions on Cybernetics.
[2] Andrew Zisserman,et al. All About VLAD , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[3] Dong Xu,et al. Learning Rotation-Invariant and Fisher Discriminative Convolutional Neural Networks for Object Detection , 2019, IEEE Transactions on Image Processing.
[4] Meng Wang,et al. Low-Rank Multi-View Embedding Learning for Micro-Video Popularity Prediction , 2018, IEEE Transactions on Knowledge and Data Engineering.
[5] MengChu Zhou,et al. A Nonnegative Latent Factor Model for Large-Scale Sparse Matrices in Recommender Systems via Alternating Direction Method , 2016, IEEE Transactions on Neural Networks and Learning Systems.
[6] MengChu Zhou,et al. Generating Highly Accurate Predictions for Missing QoS Data via Aggregating Nonnegative Latent Factor Models , 2016, IEEE Transactions on Neural Networks and Learning Systems.
[7] Tomás Pajdla,et al. NetVLAD: CNN Architecture for Weakly Supervised Place Recognition , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[8] Tat-Seng Chua,et al. Micro Tells Macro: Predicting the Popularity of Micro-Videos via a Transductive Model , 2016, ACM Multimedia.
[9] Ivan Laptev,et al. Learnable pooling with Context Gating for video classification , 2017, ArXiv.
[10] Yang Wang,et al. Video Summarization Using Fully Convolutional Sequence Networks , 2018, ECCV.
[11] MengChu Zhou,et al. Temporal Pattern-Aware QoS Prediction via Biased Non-Negative Latent Factorization of Tensors , 2020, IEEE Transactions on Cybernetics.
[12] Xiaoming Xi,et al. Getting More from One Attractive Scene: Venue Retrieval in Micro-videos , 2018, PCM.
[13] Charless C. Fowlkes,et al. The Open World of Micro-Videos , 2016, ArXiv.
[14] Meng Liu,et al. Online Data Organizer: Micro-Video Categorization by Structure-Guided Multimodal Dictionary Learning , 2019, IEEE Transactions on Image Processing.
[15] Florent Perronnin,et al. Fisher Kernels on Visual Vocabularies for Image Categorization , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[16] Nadia Mana,et al. Automatic prediction of individual performance from "thin slices" of social behavior , 2009, ACM Multimedia.
[17] Bin Luo,et al. Tag refinement of micro-videos by learning from multiple data sources , 2017, Multimedia Tools and Applications.
[18] Nitish Srivastava,et al. Unsupervised Learning of Video Representations using LSTMs , 2015, ICML.
[19] Rossano Schifanella,et al. 6 Seconds of Sound and Vision: Creativity in Micro-videos , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[20] John Z. Zhang,et al. Enhancing multi-label music genre classification through ensemble techniques , 2011, SIGIR.
[21] Jingyuan Chen,et al. Multi-Modal Learning: Study on A Large-Scale Micro-Video Data Collection , 2016, ACM Multimedia.
[22] Meng Wang,et al. Towards Micro-video Understanding by Joint Sequential-Sparse Modeling , 2017, ACM Multimedia.
[23] Richard P. Wildes,et al. Spatiotemporal Multiplier Networks for Video Action Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Tat-Seng Chua,et al. Shorter-is-Better: Venue Category Estimation from Micro-Video , 2016, ACM Multimedia.
[25] Gong Cheng,et al. RIFD-CNN: Rotation-Invariant and Fisher Discriminative Convolutional Neural Networks for Object Detection , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Shuqiang Jiang,et al. Hierarchy-Dependent Cross-Platform Multi-View Feature Learning for Venue Category Prediction , 2018, IEEE Transactions on Multimedia.
[27] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[28] Cordelia Schmid,et al. Aggregating local descriptors into a compact image representation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[29] Gabriela Csurka,et al. Visual categorization with bags of keypoints , 2004, ECCV Workshops.
[30] Shuai Li,et al. Symmetric and Nonnegative Latent Factor Models for Undirected, High-Dimensional, and Sparse Networks in Industrial Applications , 2017, IEEE Transactions on Industrial Informatics.
[31] Lei Guo,et al. When Deep Learning Meets Metric Learning: Remote Sensing Image Scene Classification via Learning Discriminative CNNs , 2018, IEEE Transactions on Geoscience and Remote Sensing.
[32] Qi Tian,et al. Enhancing Micro-video Understanding by Harnessing External Sounds , 2017, ACM Multimedia.
[33] Wei Liu,et al. Joint Learning of LSTMs-CNN and Prototype for Micro-video Venue Classification , 2018, PCM.
[34] Jianping Fan,et al. NeXtVLAD: An Efficient Neural Network to Aggregate Frame-level Features for Large-scale Video Classification , 2018, ECCV Workshops.
[35] Yoshua Bengio,et al. On the Properties of Neural Machine Translation: Encoder–Decoder Approaches , 2014, SSST@EMNLP.
[36] Christian Wolf,et al. Sequential Deep Learning for Human Action Recognition , 2011, HBU.