Gabriel Synnaeve | Ronan Collobert | Tatiana Likhomanenko | Qiantong Xu | Alex Rogozhnikov
[1] Matthijs Douze, et al. LeViT: a Vision Transformer in ConvNet’s Clothing for Faster Inference, 2021, ICCV.
[2] Yu Zhang, et al. Conformer: Convolution-augmented Transformer for Speech Recognition, 2020, INTERSPEECH.
[3] Torsten Hoefler, et al. Augment Your Batch: Improving Generalization Through Instance Repetition, 2020, CVPR.
[4] M. Seltzer, et al. Transformer-Based Acoustic Modeling for Hybrid Speech Recognition, 2019, ICASSP.
[5] Dustin Tran, et al. Image Transformer, 2018, ICML.
[6] Iasonas Kokkinos, et al. MultiGrain: a unified image embedding for classes and instances, 2019, ArXiv.
[7] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[8] S. Gelly, et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, 2020, ICLR.
[9] Yi Yang, et al. Random Erasing Data Augmentation, 2017, AAAI.
[10] Quoc V. Le, et al. Attention Augmented Convolutional Networks, 2019, ICCV.
[11] Mohammad Norouzi, et al. SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network, 2021, ArXiv.
[12] Sanjeev Khudanpur, et al. Librispeech: An ASR corpus based on public domain audio books, 2015, ICASSP.
[13] Daniel Povey, et al. The Kaldi Speech Recognition Toolkit, 2011.
[14] Yoram Singer, et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, 2011, J. Mach. Learn. Res.
[15] Shuang Xu, et al. Speech-Transformer: A No-Recurrence Sequence-to-Sequence Model for Speech Recognition, 2018, ICASSP.
[16] Quoc V. Le, et al. Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition, 2020, ArXiv.
[17] Benjamin Recht, et al. Do ImageNet Classifiers Generalize to ImageNet?, 2019, ICML.
[18] Hongyi Zhang, et al. mixup: Beyond Empirical Risk Minimization, 2017, ICLR.
[19] Stephen Lin, et al. Deep Metric Transfer for Label Propagation with Limited Annotated Data, 2018, ICCVW.
[20] Luke S. Zettlemoyer, et al. Transformers with convolutional context for ASR, 2019, ArXiv.
[21] Gabriel Synnaeve, et al. Rethinking Evaluation in ASR: Are Our Models Robust Enough?, 2020, INTERSPEECH.
[22] Mark Chen, et al. Language Models are Few-Shot Learners, 2020, NeurIPS.
[23] Quoc V. Le, et al. Randaugment: Practical automated data augmentation with a reduced search space, 2019, CVPRW.
[24] Martin Schmitt, et al. Position Information in Transformers: An Overview, 2021, Computational Linguistics.
[25] Jakob Grue Simonsen, et al. Encoding word order in complex embeddings, 2019, ICLR.
[26] Liu Yang, et al. Long Range Arena: A Benchmark for Efficient Transformers, 2020, ICLR.
[27] Kevin Duh, et al. Very Deep Transformers for Neural Machine Translation, 2020, ArXiv.
[28] Tie-Yan Liu, et al. Rethinking Positional Encoding in Language Pre-training, 2020, ICLR.
[29] Davis Liang, et al. Improve Transformer Models with Better Relative Position Embeddings, 2020, Findings of EMNLP.
[30] Hermann Ney, et al. Analysis of Positional Encodings for Neural Machine Translation, 2019, IWSLT.
[31] Nicolas Usunier, et al. End-to-End Object Detection with Transformers, 2020, ECCV.
[32] Jürgen Schmidhuber, et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks, 2006, ICML.
[33] Mary Williamson, et al. Recipes for Building an Open-Domain Chatbot, 2020, EACL.
[34] Lukasz Kaiser, et al. Universal Transformers, 2018, ICLR.
[35] Matthieu Cord, et al. Training data-efficient image transformers & distillation through attention, 2020, ICML.
[36] Quoc V. Le, et al. SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition, 2019, INTERSPEECH.
[37] Cordelia Schmid, et al. ViViT: A Video Vision Transformer, 2021, ICCV.
[38] Hermann Ney, et al. A Comparison of Transformer and LSTM Encoder Decoder Models for ASR, 2019, ASRU.
[39] Heng Wang, et al. Is Space-Time Attention All You Need for Video Understanding?, 2021, ICML.
[40] Yannick Estève, et al. TED-LIUM 3: twice as much data and corpus repartition for experiments on speaker adaptation, 2018, SPECOM.
[41] Yiming Yang, et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context, 2019, ACL.
[42] Ashish Vaswani, et al. Self-Attention with Relative Position Representations, 2018, NAACL.
[43] Steve J. Young, et al. Large vocabulary continuous speech recognition using HTK, 1994, ICASSP.
[44] Wei Chen, et al. Improving Generalization of Transformer for Speech Recognition with Parallel Schedule Sampling and Relative Positional Embedding, 2019, ArXiv.
[45] Jakob Grue Simonsen, et al. On Position Embeddings in BERT, 2021, ICLR.
[46] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[47] Michael S. Bernstein, et al. ImageNet Large Scale Visual Recognition Challenge, 2014, International Journal of Computer Vision.
[48] N. Codella, et al. CvT: Introducing Convolutions to Vision Transformers, 2021, ICCV.
[49] Shengfeng Pan, et al. RoFormer: Enhanced Transformer with Rotary Position Embedding, 2021, ArXiv.
[50] Seong Joon Oh, et al. CutMix: Regularization Strategy to Train Strong Classifiers With Localizable Features, 2019, ICCV.
[51] Edouard Grave, et al. Reducing Transformer Depth on Demand with Structured Dropout, 2019, ICLR.
[52] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[53] Myle Ott, et al. fairseq: A Fast, Extensible Toolkit for Sequence Modeling, 2019, NAACL.
[54] Andrew M. Dai, et al. Music Transformer: Generating Music with Long-Term Structure, 2018, ICLR.