Residual Attention-Based Fusion for Video Classification
[1] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Sergey Ioffe, et al. Rethinking the Inception Architecture for Computer Vision, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[3] Antonio Torralba, et al. SoundNet: Learning Sound Representations from Unlabeled Video, 2016, NIPS.
[4] Alex Graves, et al. Recurrent Models of Visual Attention, 2014, NIPS.
[5] Matthieu Cord, et al. MUTAN: Multimodal Tucker Fusion for Visual Question Answering, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[6] Lukasz Kaiser, et al. Attention Is All You Need, 2017, NIPS.
[7] Mei-Ling Shyu, et al. Multimodal deep learning based on multiple correspondence analysis for disaster management, 2018, World Wide Web.
[8] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.
[9] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.