Active Vision from Image-Text Multimodal System Learning
暂无分享,去创建一个
[1] Trevor Darrell,et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[2] Brendan J. Frey,et al. Learning Wake-Sleep Recurrent Attention Models , 2015, NIPS.
[3] W. Marsden. I and J , 2012 .
[4] Jiaxing Zhang,et al. Attentional Neural Network: Feature Selection Using Cognitive Feedback , 2014, NIPS.
[5] Fei-Fei Li,et al. Deep visual-semantic alignments for generating image descriptions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[6] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[7] Honglak Lee,et al. Improved Multimodal Deep Learning with Variation of Information , 2014, NIPS.
[8] Alex Graves,et al. Recurrent Models of Visual Attention , 2014, NIPS.
[9] Mario Fritz,et al. A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input , 2014, NIPS.
[10] Andrew Zisserman,et al. Spatial Transformer Networks , 2015, NIPS.
[11] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[12] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[13] Juhan Nam,et al. Multimodal Deep Learning , 2011, ICML.
[14] Geoffrey E. Hinton,et al. Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.
[15] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[16] Nitish Srivastava,et al. Multimodal learning with deep Boltzmann machines , 2012, J. Mach. Learn. Res..
[17] Koray Kavukcuoglu,et al. Visual Attention , 2020, Computational Models for Cognitive Vision.