Elementwise Language Representation
[1] Steffen Eger, et al. ByGPT5: End-to-End Style-conditioned Poetry Generation with Token-free Language Models, 2022, ACL.
[2] Eric Villemonte de la Clergerie, et al. MANTa: Efficient Gradient-Based Tokenization for Robust End-to-End Language Modeling, 2022, ArXiv.
[3] Guillem Cucurull, et al. Galactica: A Large Language Model for Science, 2022, ArXiv.
[4] Sergio Gomez Colmenarejo, et al. A Generalist Agent, 2022, Trans. Mach. Learn. Res.
[5] Omer Levy, et al. Models In a Spelling Bee: Language Models Implicitly Learn the Character Composition of Tokens, 2021, NAACL.
[6] Olivier J. Hénaff, et al. Perceiver IO: A General Architecture for Structured Inputs & Outputs, 2021, ICLR.
[7] Hyung Won Chung, et al. Charformer: Fast Character Transformers via Gradient-based Subword Tokenization, 2021, ICLR.
[8] Rami Al-Rfou, et al. ByT5: Towards a Token-Free Future with Pre-trained Byte-to-Byte Models, 2021, TACL.
[9] Dan Garrette, et al. Canine: Pre-training an Efficient Tokenization-Free Encoder for Language Representation, 2021, TACL.
[10] Wookey Lee, et al. PatentNet: multi-label classification of patent documents using deep learning based language understanding, 2021, Scientometrics.
[11] Alexander M. Rush, et al. Block Pruning For Faster Transformers, 2021, EMNLP.
[12] Oriol Vinyals, et al. Highly accurate protein structure prediction with AlphaFold, 2021, Nature.
[13] Naoaki Okazaki, et al. Joint Optimization of Tokenization and Downstream Model, 2021, Findings of ACL.
[14] Graham Neubig, et al. Multi-view Subword Regularization, 2021, NAACL.
[15] Ting Liu, et al. CharBERT: Character-aware Pre-trained Language Model, 2020, COLING.
[16] M. Zaheer, et al. Big Bird: Transformers for Longer Sequences, 2020, NeurIPS.
[17] Nikolaos Pappas, et al. Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention, 2020, ICML.
[18] Han Fang, et al. Linformer: Self-Attention with Linear Complexity, 2020, ArXiv.
[19] Jieh Hsiang, et al. Patent classification by fine-tuning BERT language model, 2020, World Patent Information.
[20] Mark Chen, et al. Language Models are Few-Shot Learners, 2020, NeurIPS.
[21] Mohammad Norouzi, et al. Dynamic Programming Encoding for Subword Segmentation in Neural Machine Translation, 2020, ACL.
[22] Arman Cohan, et al. Longformer: The Long-Document Transformer, 2020, ArXiv.
[23] Quoc V. Le, et al. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators, 2020, ICLR.
[24] Jure Leskovec, et al. Learning to Simulate Complex Physics with Graph Networks, 2020, ICML.
[25] Emma J. Chory, et al. A Deep Learning Approach to Antibiotic Discovery, 2020, Cell.
[26] Lukasz Kaiser, et al. Reformer: The Efficient Transformer, 2020, ICLR.
[27] Ivan Provilkov, et al. BPE-Dropout: Simple and Effective Subword Regularization, 2019, ACL.
[28] Colin Raffel, et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, 2019, J. Mach. Learn. Res.
[29] Kevin Gimpel, et al. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations, 2019, ICLR.
[30] Xin Jiang, et al. TinyBERT: Distilling BERT for Natural Language Understanding, 2019, Findings of EMNLP.
[31] Thomas Wolf, et al. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter, 2019, ArXiv.
[32] Yiming Yang, et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding, 2019, NeurIPS.
[33] Fedor Moiseev, et al. Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned, 2019, ACL.
[34] Omer Levy, et al. Are Sixteen Heads Really Better than One?, 2019, NeurIPS.
[35] Ilya Sutskever, et al. Generating Long Sequences with Sparse Transformers, 2019, ArXiv.
[36] Jimmy J. Lin, et al. Distilling Task-Specific Knowledge from BERT into Simple Neural Networks, 2019, ArXiv.
[37] Frank Hutter, et al. Decoupled Weight Decay Regularization, 2017, ICLR.
[38] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[39] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[40] Roman Suvorov, et al. Fast and Accurate Patent Classification in Search Engines, 2018, Journal of Physics: Conference Series.
[41] Shaobo Li, et al. DeepPatent: patent classification with convolutional neural networks and word embedding, 2018, Scientometrics.
[42] Taku Kudo, et al. SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing, 2018, EMNLP.
[43] Lukasz Kaiser, et al. Generating Wikipedia by Summarizing Long Sequences, 2018, ICLR.
[44] Raquel Urtasun, et al. The Reversible Residual Network: Backpropagation Without Storing Activations, 2017, NIPS.
[45] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[46] Sungjoo Lee, et al. Forecasting and identifying multi-technology convergence based on patent data: the case of IT and BT industries in 2020, 2017, Scientometrics.
[47] George Kurian, et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, 2016, ArXiv.
[48] Sora Lim, et al. IPC Multi-label Classification Applying the Characteristics of Patent Documents, 2016, CSA/CUTE.
[49] Rico Sennrich, et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.
[50] Xiang Zhang, et al. Character-level Convolutional Networks for Text Classification, 2015, NIPS.
[51] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.
[52] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.
[53] Alex Graves, et al. Generating Sequences With Recurrent Neural Networks, 2013, ArXiv.
[54] Geoffrey E. Hinton, et al. Generating Text with Recurrent Neural Networks, 2011, ICML.
[55] Sungjoo Lee, et al. Business planning based on technological capabilities: Patent analysis for technology-driven roadmapping, 2009, Technological Forecasting & Social Change.
[56] United States Patent and Trademark Office. Manual of Patent Examining Procedure, 2004.