Hugo Touvron | Piotr Bojanowski | Mathilde Caron | Matthieu Cord | Alaaeldin El-Nouby | Edouard Grave | Gautier Izacard | Armand Joulin | Gabriel Synnaeve | Jakob Verbeek | Hervé Jégou
[1] Ilya Sutskever et al., Generating Long Sequences with Sparse Transformers, 2019, arXiv.
[2] Lukasz Kaiser et al., Attention Is All You Need, 2017, NIPS.
[3] Alexander Novikov et al., Tensorizing Neural Networks, 2015, NIPS.
[4] Jorge Nocedal et al., On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, 2016, ICLR.
[5] Torsten Hoefler et al., Augment Your Batch: Improving Generalization Through Instance Repetition, 2020, CVPR.
[6] Hongyi Zhang et al., mixup: Beyond Empirical Risk Minimization, 2017, ICLR.
[7] Lexing Ying et al., SwitchNet: a neural network model for forward and inverse scattering problems, 2018, SIAM J. Sci. Comput.
[8] Ankur Bapna et al., The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation, 2018, ACL.
[9] Sergey Ioffe et al., Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[10] Edouard Grave et al., Training with Quantization Noise for Extreme Model Compression, 2020, ICLR.
[11] Clément Chatelain et al., Extraction de séquences numériques dans des documents manuscrits quelconques [Extraction of numerical sequences from arbitrary handwritten documents], 2006.
[12] Giorgos Tolias et al., Fine-Tuning CNN Image Retrieval with No Human Annotation, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[13] Andrew Zisserman et al., Automated Flower Classification over a Large Number of Classes, 2008, ICVGIP.
[14] Kevin Gimpel et al., Gaussian Error Linear Units (GELUs), 2016.
[15] Matthieu Cord et al., Training data-efficient image transformers & distillation through attention, 2020, ICML.
[16] Myle Ott et al., Scaling Neural Machine Translation, 2018, WMT.
[17] Geoffrey E. Hinton et al., ImageNet Classification with Deep Convolutional Neural Networks, 2012, Commun. ACM.
[18] Andrew Zisserman et al., Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[19] George Kurian et al., Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, 2016, arXiv.
[20] Michael S. Bernstein et al., ImageNet Large Scale Visual Recognition Challenge, 2014, International Journal of Computer Vision.
[21] Luke Melas-Kyriazi et al., Do You Even Need Attention? A Stack of Feed-Forward Layers Does Surprisingly Well on ImageNet, 2021, arXiv.
[22] Jian Sun et al., Deep Residual Learning for Image Recognition, 2015, CVPR.
[23] Ross B. Girshick et al., Fast and Accurate Model Scaling, 2021, CVPR.
[24] Peter Stone et al., Scalable training of artificial neural networks with adaptive sparse connectivity inspired by network science, 2017, Nature Communications.
[25] Sivaraman Balakrishnan et al., How Many Samples are Needed to Estimate a Convolutional Neural Network?, 2018, NeurIPS.
[26] Patrice Y. Simard et al., Best practices for convolutional neural networks applied to visual document analysis, 2003, ICDAR.
[27] Gabriel Synnaeve et al., Differentiable Model Compression via Pseudo Quantization Noise, 2021, arXiv.
[28] Jian Sun et al., Identity Mappings in Deep Residual Networks, 2016, ECCV.
[29] Behnam Neyshabur et al., Towards Learning Convolutions from Scratch, 2020, NeurIPS.
[30] Guiguang Ding et al., RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition, 2021, arXiv.
[31] Kaiming He et al., Designing Network Design Spaces, 2020, CVPR.
[32] Alexander Kolesnikov et al., MLP-Mixer: An all-MLP Architecture for Vision, 2021, NeurIPS.
[33] K. Simonyan et al., High-Performance Large-Scale Image Recognition Without Normalization, 2021, ICML.
[34] Thomas Brox et al., Discriminative Unsupervised Feature Learning with Convolutional Neural Networks, 2014, NIPS.
[35] Quoc V. Le et al., Pay Attention to MLPs, 2021, NeurIPS.
[36] Ekin D. Cubuk et al., Revisiting ResNets: Improved Training and Scaling Strategies, 2021, NeurIPS.
[37] Georg Heigold et al., An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, 2021, ICLR.
[38] Joan Bruna et al., Finding the Needle in the Haystack with Convolutions: on the benefits of architectural bias, 2019, NeurIPS.
[39] Zhuowen Tu et al., Aggregated Residual Transformations for Deep Neural Networks, 2016, CVPR.
[40] Jaehoon Lee et al., Bayesian Deep Convolutional Networks with Many Channels are Gaussian Processes, 2018, ICLR.
[41] Julien Mairal et al., Emerging Properties in Self-Supervised Vision Transformers, 2021, ICCV.
[42] Yi Yang et al., Random Erasing Data Augmentation, 2017, AAAI.
[43] Quoc V. Le et al., RandAugment: Practical Automated Data Augmentation with a Reduced Search Space, 2019, CVPR Workshops.
[44] Benjamin Recht et al., Do ImageNet Classifiers Generalize to ImageNet?, 2019, ICML.
[45] Matthieu Cord et al., Grafit: Learning fine-grained image representations with coarse labels, 2020, ICCV.
[46] Yang Song et al., The iNaturalist Species Classification and Detection Dataset, 2017, CVPR.
[47] Yoshua Bengio et al., Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[48] Roland Memisevic et al., How far can we go without convolution: Improving fully-connected networks, 2015, arXiv.
[49] Seong Joon Oh et al., CutMix: Regularization Strategy to Train Strong Classifiers With Localizable Features, 2019, ICCV.
[50] Dustin Tran et al., Image Transformer, 2018, ICML.
[51] François Chollet et al., Xception: Deep Learning with Depthwise Separable Convolutions, 2016, CVPR.
[52] Théodore Bluche et al., Deep Neural Networks for Large Vocabulary Handwritten Text Recognition, 2015.
[53] Luca Maria Gambardella et al., Deep Big Multilayer Perceptrons for Digit Recognition, 2012, Neural Networks: Tricks of the Trade.
[54] Xiaohua Zhai et al., Are we done with ImageNet?, 2020, arXiv.
[55] Matthieu Cord et al., Going deeper with Image Transformers, 2021, ICCV.
[56] Yi Tay et al., Synthesizer: Rethinking Self-Attention for Transformer Models, 2020, ICML.
[57] Quoc V. Le et al., Searching for MobileNetV3, 2019, ICCV.
[58] Yann Dauphin et al., Convolutional Sequence to Sequence Learning, 2017, ICML.
[59] Quoc V. Le et al., EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks, 2019, ICML.
[60] Kilian Q. Weinberger et al., Deep Networks with Stochastic Depth, 2016, ECCV.
[61] Yoshua Bengio et al., Deep Sparse Rectifier Neural Networks, 2011, AISTATS.
[62] Jonathan Krause et al., 3D Object Representations for Fine-Grained Categorization, 2013, ICCV Workshops.
[63] Matthew Richardson et al., Do Deep Convolutional Nets Really Need to be Deep and Convolutional?, 2016, ICLR.
[64] James Demmel et al., Large Batch Optimization for Deep Learning: Training BERT in 76 minutes, 2019, ICLR.
[65] Vladlen Koltun et al., Exploring Self-Attention for Image Recognition, 2020, CVPR.
[66] Alex Krizhevsky et al., Learning Multiple Layers of Features from Tiny Images, 2009.
[67] Yoshua Bengio et al., Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.