A Multimodal Aggregation Network With Serial Self-Attention Mechanism for Micro-Video Multi-Label Classification
[1] Yixiang Lu,et al. Learning view-specific labels and label-feature dependence maximization for multi-view multi-label classification , 2022, Appl. Soft Comput.
[2] Stephen Lin,et al. Video Swin Transformer , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[3] Shih-Fu Chang,et al. VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text , 2021, NeurIPS.
[4] Yu Qiao,et al. Attention-Driven Dynamic Graph Convolutional Network for Multi-label Image Recognition , 2020, ECCV.
[5] Yuexian Zou,et al. Modeling Label Dependencies for Audio Tagging With Graph Convolutional Network , 2020, IEEE Signal Processing Letters.
[6] Zhenzhong Chen,et al. A Multimodal Variational Encoder-Decoder Framework for Micro-video Popularity Prediction , 2020, WWW.
[7] Zheng-Jun Zha,et al. Learning and Fusing Multiple User Interest Representations for Micro-Video and Movie Recommendations , 2020, IEEE Transactions on Multimedia.
[8] Gang Cao,et al. Multi-modal sequence model with gated fully convolutional blocks for micro-video venue classification , 2019, Multimedia Tools and Applications.
[9] Zhiming Luo,et al. Manifold regularized discriminative feature selection for multi-label learning , 2019, Pattern Recognit.
[10] Hefeng Wu,et al. Learning Semantic-Specific Graph Representation for Multi-Label Image Recognition , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[11] Xiaobo Wang,et al. Multi-View Multi-Label Learning with View-Specific Information Extraction , 2019, IJCAI.
[12] Cordelia Schmid,et al. VideoBERT: A Joint Model for Video and Language Representation Learning , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[13] Qi Tian,et al. Enhancing Micro-video Understanding by Harnessing External Sounds , 2017, ACM Multimedia.
[14] Zhi-Hua Zhou,et al. Multi-Label Learning with Global and Local Label Correlation , 2017, IEEE Transactions on Knowledge and Data Engineering.
[15] Yu-Chiang Frank Wang,et al. Learning Deep Latent Spaces for Multi-Label Classification , 2017, ArXiv.
[16] Tat-Seng Chua,et al. Micro Tells Macro: Predicting the Popularity of Micro-Videos via a Transductive Model , 2016, ACM Multimedia.
[17] Cheng Li,et al. Conditional Bernoulli Mixtures for Multi-label Classification , 2016, ICML.
[18] Yun Fu,et al. Robust Multi-View Subspace Learning through Dual Low-Rank Decompositions , 2016, AAAI.
[19] Limin Wang,et al. Action recognition with trajectory-pooled deep-convolutional descriptors , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[20] Lorenzo Torresani,et al. Learning Spatiotemporal Features with 3D Convolutional Networks , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[21] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Andrew Zisserman,et al. Return of the Devil in the Details: Delving Deep into Convolutional Nets , 2014, BMVC.
[23] Murat Akbacak,et al. Softening quantization in bag-of-audio-words , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Vincent W. L. Tam,et al. A Cross-Attention BERT-Based Framework for Continuous Sign Language Recognition , 2022, IEEE Signal Processing Letters.
[25] Chenguang Song,et al. Learning Social Relationship From Videos via Pre-Trained Multimodal Transformer , 2022, IEEE Signal Processing Letters.
[26] Changsheng Xu,et al. Heterogeneous Hierarchical Feature Aggregation Network for Personalized Micro-Video Recommendation , 2022, IEEE Transactions on Multimedia.
[27] Qingling Cai,et al. Multi-Label Classification of Fundus Images With Graph Convolutional Network and Self-Supervised Learning , 2021, IEEE Signal Processing Letters.