Shuangfei Zhai | Walter A. Talbott | Nitish Srivastava | Chen Huang | Hanlin Goh | Ruixiang Zhang | Josh Susskind
[1] Xiang Bai,et al. Asymmetric Non-Local Neural Networks for Semantic Segmentation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[2] Roy Schwartz,et al. Random Feature Attention , 2021, ICLR.
[3] Matthieu Cord,et al. Training data-efficient image transformers & distillation through attention , 2020, ICML.
[4] Leonidas J. Guibas,et al. ShapeNet: An Information-Rich 3D Model Repository , 2015, ArXiv.
[5] Yunchao Wei,et al. CCNet: Criss-Cross Attention for Semantic Segmentation , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[6] Yee Whye Teh,et al. Set Transformer , 2018, ICML.
[7] Jacob Devlin,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[8] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[9] Quoc V. Le,et al. Pay Attention to MLPs , 2021, NeurIPS.
[10] Nikolaos Pappas,et al. Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention , 2020, ICML.
[11] Yi Tay,et al. Efficient Transformers: A Survey , 2020, ArXiv.
[12] Yiming Yang,et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context , 2019, ACL.
[13] Ben Poole,et al. Categorical Reparameterization with Gumbel-Softmax , 2016, ICLR.
[14] Georg Heigold,et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale , 2021, ICLR.
[15] A. Yuille,et al. Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation , 2020, ECCV.
[16] S. M. Ali Eslami,et al. PolyGen: An Autoregressive Generative Model of 3D Meshes , 2020, ICML.
[17] Yoshua Bengio,et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.
[18] Lukasz Kaiser,et al. Reformer: The Efficient Transformer , 2020, ICLR.
[19] Aurko Roy,et al. Efficient Content-Based Sparse Attention with Routing Transformers , 2021, TACL.
[20] Yann Dauphin,et al. Pay Less Attention with Lightweight and Dynamic Convolutions , 2019, ICLR.
[21] Xilin Chen,et al. Interlaced Sparse Self-Attention for Semantic Segmentation , 2019, ArXiv.
[22] Lukasz Kaiser,et al. Rethinking Attention with Performers , 2020, ArXiv.
[23] Timothy P. Lillicrap,et al. Compressive Transformers for Long-Range Sequence Modelling , 2019, ICLR.
[24] Alec Radford,et al. Improving Language Understanding by Generative Pre-Training , 2018 .
[25] Edouard Grave,et al. Adaptive Attention Span in Transformers , 2019, ACL.
[26] Irwan Bello. LambdaNetworks: Modeling Long-Range Interactions Without Attention , 2021, ICLR.
[27] Alexander Kolesnikov,et al. MLP-Mixer: An all-MLP Architecture for Vision , 2021, NeurIPS.
[28] Ilya Sutskever,et al. Generating Long Sequences with Sparse Transformers , 2019, ArXiv.
[29] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[30] Frank Hutter,et al. Decoupled Weight Decay Regularization , 2017, ICLR.
[31] Yi Tay,et al. Synthesizer: Rethinking Self-Attention for Transformer Models , 2020, ICML.
[32] Xiaogang Wang,et al. Deep Learning Face Attributes in the Wild , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[33] Xi Chen,et al. PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications , 2017, ICLR.
[34] Mohammad Norouzi,et al. Pixel Recursive Super Resolution , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[35] Han Fang,et al. Linformer: Self-Attention with Linear Complexity , 2020, ArXiv.
[36] Ashish Vaswani,et al. Stand-Alone Self-Attention in Vision Models , 2019, NeurIPS.
[37] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.