A Universal Quaternion Hypergraph Network for Multimodal Video Question Answering
Zhicheng Guo | Licheng Jiao | Fang Liu | Xu Liu | Jiaxuan Zhao
[1] Guanglu Sun,et al. Video Question Answering: a Survey of Models and Datasets , 2021, Mobile Networks and Applications.
[2] Shaoyi Du,et al. Hypergraph Learning: Methods and Practices , 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[3] Byoung-Tak Zhang,et al. Co-Attentional Transformers for Story-Based Video Understanding , 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Jing Liu,et al. Dual Hierarchical Temporal Convolutional Network with QA-Aware Dynamic Normalization for Video Story Question Answering , 2020, ACM Multimedia.
[5] Chang D. Yoo,et al. Modality Shifting Attention Network for Multi-Modal Video Question Answering , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[6] Mohit Bansal,et al. Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA , 2020, ACL.
[7] T. Abdelzaher,et al. Hypergraph Learning with Line Expansion , 2020, ArXiv.
[8] Peng Gao,et al. Character Matters: Video Story Understanding with Character-Aware Relations , 2020, ArXiv.
[9] Byoung-Tak Zhang,et al. DramaQA: Character-Centered Video Story Understanding with Hierarchical QA , 2020, AAAI.
[10] Licheng Yu,et al. Hero: Hierarchical Encoder for Video+Language Omni-representation Pre-training , 2020, EMNLP.
[11] Thao Minh Le,et al. Dynamic Language Binding in Relational Visual Reasoning , 2020, IJCAI.
[12] Jiebo Luo,et al. Joint Commonsense and Relation Reasoning for Image and Video Captioning , 2020, AAAI.
[13] Yahong Han,et al. Reasoning with Heterogeneous Graph Alignment for Video Question Answering , 2020, AAAI.
[14] Yueting Zhuang,et al. Frame Augmented Alternating Attention Network for Video Question Answering , 2020, IEEE Transactions on Multimedia.
[15] Chenhui Chu,et al. BERT Representations for Video Question Answering , 2020, 2020 IEEE Winter Conference on Applications of Computer Vision (WACV).
[16] Truyen Tran,et al. Hierarchical Conditional Relation Networks for Video Question Answering , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Chiori Hori,et al. Multi-Layer Content Interaction Through Quaternion Product for Visual Question Answering , 2020, IEEE International Conference on Acoustics, Speech, and Signal Processing.
[18] Ruochi Zhang,et al. Hyper-SAGNN: a self-attention based graph neural network for hypergraphs , 2019, ICLR.
[19] Chenhui Chu,et al. KnowIT VQA: Answering Knowledge-Based Questions about Videos , 2019, AAAI.
[20] Omer Levy,et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach , 2019, ArXiv.
[21] Siu Cheung Hui,et al. Lightweight and Efficient Neural Natural Language Processing with Quaternion Networks , 2019, ACL.
[22] Chang D. Yoo,et al. Gaining Extra Supervision via Multi-task learning for Multi-Modal Video Question Answering , 2019, 2019 International Joint Conference on Neural Networks (IJCNN).
[23] Benjamin J Raphael,et al. Random Walks on Hypergraphs with Edge-Dependent Vertex Weights , 2019, ICML.
[24] Licheng Yu,et al. TVQA+: Spatio-Temporal Grounding for Video Question Answering , 2019, ACL.
[25] Junyeong Kim,et al. Progressive Attention Memory Network for Movie Story Question Answering , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Song Bai,et al. Hypergraph Convolution and Hypergraph Attention , 2019, Pattern Recognit.
[27] Jitendra Malik,et al. SlowFast Networks for Video Recognition , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[28] Yue Gao,et al. Hypergraph Neural Networks , 2018, AAAI.
[29] Tao Mei,et al. Exploring Visual Relationship for Image Captioning , 2018, ECCV.
[30] Byoung-Tak Zhang,et al. Multimodal Dual Attention Memory for Video Story Question Answering , 2018, ECCV.
[31] Partha Pratim Talukdar,et al. HyperGCN: A New Method of Training Graph Convolutional Networks on Hypergraphs , 2018.
[32] Licheng Yu,et al. TVQA: Localized, Compositional Video Question Answering , 2018, EMNLP.
[33] Ying Zhang,et al. Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition , 2018, INTERSPEECH.
[34] Titouan Parcollet,et al. Quaternion Recurrent Neural Networks , 2018, ICLR.
[35] Dario Pavllo,et al. QuaterNet: A Quaternion-based Recurrent Model for Human Motion , 2018, BMVC.
[36] T.-H. Hubert Chan,et al. Generalizing the Hypergraph Laplacian via a Diffusion Process with Mediators , 2018, COCOON.
[37] Bo Wang,et al. Movie Question Answering: Remembering the Textual Cues for Layered Visual Contents , 2018, AAAI.
[38] Xiao-Ming Wu,et al. Deeper Insights into Graph Convolutional Networks for Semi-Supervised Learning , 2018, AAAI.
[39] Anthony S. Maida,et al. Deep Quaternion Networks , 2017, 2018 International Joint Conference on Neural Networks (IJCNN).
[40] Gunhee Kim,et al. A Read-Write Memory Network for Movie Story Understanding , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[41] Deng Cai,et al. Unifying the Video and Question Attentions for Open-Ended Video Question Answering , 2017, IEEE Transactions on Image Processing.
[42] Lei Zhang,et al. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[43] Byoung-Tak Zhang,et al. DeepStory: Video Story QA by Deep Embedded Memory Networks , 2017, IJCAI.
[44] Shih-Fu Chang,et al. Modeling Multimodal Clues in a Hybrid Deep Learning Framework for Video Classification , 2017, IEEE Transactions on Multimedia.
[45] Andrew Zisserman,et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[46] Fabio Viola,et al. The Kinetics Human Action Video Dataset , 2017, ArXiv.
[47] Yale Song,et al. TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[48] Max Welling,et al. Modeling Relational Data with Graph Convolutional Networks , 2017, ESWC.
[49] Li-Jia Li,et al. Dense Captioning with Joint Inference and Visual Context , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[50] Max Welling,et al. Semi-Supervised Classification with Graph Convolutional Networks , 2016, ICLR.
[51] Michael S. Bernstein,et al. Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations , 2016, International Journal of Computer Vision.
[52] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[53] Sanja Fidler,et al. MovieQA: Understanding Stories in Movies through Question-Answering , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[54] Lorenzo Torresani,et al. Learning Spatiotemporal Features with 3D Convolutional Networks , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[55] Jeffrey Pennington,et al. GloVe: Global Vectors for Word Representation , 2014, EMNLP.
[56] Akira Hirose,et al. Quaternion Neural-Network-Based PolSAR Land Classification in Poincare-Sphere-Parameter Space , 2014, IEEE Transactions on Geoscience and Remote Sensing.
[57] Richard Szeliski,et al. Computer Vision - Algorithms and Applications , 2011, Texts in Computer Science.
[58] Fei-Fei Li,et al. ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[59] Bernhard Schölkopf,et al. Learning with Hypergraphs: Clustering, Classification, and Embedding , 2006, NIPS.
[60] Soo-Chang Pei,et al. Efficient implementation of quaternion Fourier transform, convolution, and correlation by 2-D complex FFT , 2001, IEEE Trans. Signal Process.
[61] Stephen J. Sangwine,et al. Hypercomplex Fourier Transforms of Color Images , 2001, IEEE Transactions on Image Processing.
[62] William Rowan Hamilton,et al. XI. On quaternions; or on a new system of imaginaries in algebra , 1848.
[63] William Rowan Hamilton,et al. On Quaternions, or on a New System of Imaginaries in Algebra , 1847.
[64] Xiao Wu,et al. Adversarial Multimodal Network for Movie Story Question Answering , 2021, IEEE Transactions on Multimedia.
[65] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.