Fast, Effective, and Self-Supervised: Transforming Masked Language Models into Universal Lexical and Sentence Encoders
Fangyu Liu | Ivan Vulić | Anna Korhonen | Nigel Collier