Stand-Alone Self-Attention in Vision Models
暂无分享,去创建一个
Ashish Vaswani | Jonathon Shlens | Niki Parmar | Prajit Ramachandran | Irwan Bello | Anselm Levskaya | Anselm Levskaya | Ashish Vaswani | Jonathon Shlens | Niki Parmar | Irwan Bello | Prajit Ramachandran
[1] Rafael C. González,et al. Local Determination of a Moving Contrast Edge , 1985, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[2] Y. Nesterov. A method for solving the convex programming problem with convergence rate O(1/k^2) , 1983 .
[3] Kunihiko Fukushima,et al. Neocognitron: A hierarchical neural network capable of visual pattern recognition , 1988, Neural Networks.
[4] Lawrence D. Jackel,et al. Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.
[5] Boris Polyak,et al. Acceleration of stochastic approximation by averaging , 1992 .
[6] William Bialek,et al. Statistics of Natural Images: Scaling in the Woods , 1993, NIPS.
[7] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[8] Eero P. Simoncelli,et al. Natural image statistics and neural representation. , 2001, Annual review of neuroscience.
[9] Kunihiko Fukushima,et al. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position , 1980, Biological Cybernetics.
[10] Jean-Michel Morel,et al. A non-local algorithm for image denoising , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[11] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .
[12] Fei-Fei Li,et al. ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[13] William J. Dally,et al. The GPU Computing Era , 2010, IEEE Micro.
[14] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[15] Geoffrey E. Hinton,et al. On the importance of initialization and momentum in deep learning , 2013, ICML.
[16] Christian Szegedy,et al. DeepPose: Human Pose Estimation via Deep Neural Networks , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[17] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[18] Stéphane Mallat,et al. Rigid-Motion Scattering for Texture Classification , 2014, ArXiv.
[19] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[20] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[21] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[22] Yoshua Bengio,et al. Attention-Based Models for Speech Recognition , 2015, NIPS.
[23] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[25] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[26] Sergey Ioffe,et al. Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Quoc V. Le,et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[29] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput..
[30] Jian Sun,et al. Identity Mappings in Deep Residual Networks , 2016, ECCV.
[31] Jia Deng,et al. Stacked Hourglass Networks for Human Pose Estimation , 2016, ECCV.
[32] George Kurian,et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation , 2016, ArXiv.
[33] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[34] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[35] David A. Patterson,et al. In-datacenter performance analysis of a tensor processing unit , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[36] Bo Chen,et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.
[37] Yann Dauphin,et al. Convolutional Sequence to Sequence Learning , 2017, ICML.
[38] François Chollet,et al. Xception: Deep Learning with Depthwise Separable Convolutions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[39] Xi Chen,et al. PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications , 2017, ICLR.
[40] Quoc V. Le,et al. Neural Architecture Search with Reinforcement Learning , 2016, ICLR.
[41] Kaiming He,et al. Focal Loss for Dense Object Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[42] Frank Hutter,et al. SGDR: Stochastic Gradient Descent with Warm Restarts , 2016, ICLR.
[43] Kaiming He,et al. Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[44] Kaiming He,et al. Mask R-CNN , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[45] Zhuowen Tu,et al. Aggregated Residual Transformations for Deep Neural Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[46] Dustin Tran,et al. Image Transformer , 2018, ICML.
[47] Dustin Tran,et al. Mesh-TensorFlow: Deep Learning for Supercomputers , 2018, NeurIPS.
[48] George Papandreou,et al. MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[49] George Papandreou,et al. Searching for Efficient Multi-Scale Architectures for Dense Image Prediction , 2018, NeurIPS.
[50] Abhinav Gupta,et al. Non-local Neural Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[51] Douglas Eck,et al. Music Transformer , 2018, 1809.04281.
[52] Tomasz Malisiewicz,et al. SuperPoint: Self-Supervised Interest Point Detection and Description , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[53] Shuicheng Yan,et al. A2-Nets: Double Attention Networks , 2018, NeurIPS.
[54] Geoffrey E. Hinton,et al. Assessing the Scalability of Biologically-Motivated Deep Learning Algorithms and Architectures , 2018, NeurIPS.
[55] Ashish Vaswani,et al. Self-Attention with Relative Position Representations , 2018, NAACL.
[56] Iasonas Kokkinos,et al. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[57] Gang Sun,et al. Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks , 2018, NeurIPS.
[58] Max Welling,et al. Spherical CNNs , 2018, ICLR.
[59] Vijay Vasudevan,et al. Learning Transferable Architectures for Scalable Image Recognition , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[60] Mark Sandler,et al. MobileNetV2: Inverted Residuals and Linear Bottlenecks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[61] Gang Sun,et al. Squeeze-and-Excitation Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[62] Stephen Lin,et al. Local Relation Networks for Image Recognition , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[63] Stephen Lin,et al. An Empirical Study of Spatial Attention Mechanisms in Deep Networks , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[64] Quoc V. Le,et al. Attention Augmented Convolutional Networks , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[65] Bo Chen,et al. MnasNet: Platform-Aware Neural Architecture Search for Mobile , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[66] Max Welling,et al. Gauge Equivariant Convolutional Networks and the Icosahedral CNN 1 , 2019 .
[67] Ilya Sutskever,et al. Language Models are Unsupervised Multitask Learners , 2019 .
[68] Quoc V. Le,et al. NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[69] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[70] Yann Dauphin,et al. Pay Less Attention with Lightweight and Dynamic Convolutions , 2019, ICLR.