Linear Complexity Self-Attention with 3rd Order Polynomials
Philip H. S. Torr, Grigorios G. Chrysos, S. Zafeiriou, Jiankang Deng, F. Babiloni, Filippos Kokkinos, Matteo Maggioni, Ioannis Marras