Text and Code Embeddings by Contrastive Pre-Training
Peter Welinder | Felipe Petroski Such | Alec Radford | Arvind Neelakantan | Chris Hallacy | Raul Puri | Girish Sastry | Nikolas Tezak | Pranav Shyam | Jesse Michael Han | Jong Wook Kim | Gretchen Krueger | Tao Xu | Lilian Weng | Jerry Tworek | Qiming Yuan | Johannes Heidecke | Boris Power | Tyna Eloundou Nekoul | David Schnurr | Kenny Hsu | Madeleine Thompson | Tabarak Khan | Toki Sherbakov | Joanne Jang
[1] Ruslan Salakhutdinov,et al. Towards Debiasing Sentence Representations , 2020, ACL.
[2] Furu Wei,et al. MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers , 2020, NeurIPS.
[3] Jiarun Cao,et al. Whitening Sentence Representations for Better Semantics and Faster Retrieval , 2021, ArXiv.
[4] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[5] Chandler May,et al. On Measuring Social Biases in Sentence Encoders , 2019, NAACL.
[6] James Philbin,et al. FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Phillip Isola,et al. Contrastive Multiview Coding , 2019, ECCV.
[8] Marc Brockschmidt,et al. CodeSearchNet Challenge: Evaluating the State of Semantic Code Search , 2019, ArXiv.
[9] Ming-Wei Chang,et al. Latent Retrieval for Weakly Supervised Open Domain Question Answering , 2019, ACL.
[10] Cordelia Schmid,et al. VideoBERT: A Joint Model for Video and Language Representation Learning , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[11] Sosuke Kobayashi,et al. Contextual Augmentation: Data Augmentation by Words with Paradigmatic Relations , 2018, NAACL.
[12] Ming-Wei Chang,et al. REALM: Retrieval-Augmented Language Model Pre-Training , 2020, ICML.
[13] William L. Hamilton,et al. End-to-End Training of Neural Retrievers for Open-Domain Question Answering , 2021, ACL.
[14] Hugo Zaragoza,et al. The Probabilistic Relevance Framework: BM25 and Beyond , 2009, Found. Trends Inf. Retr.
[15] Christopher Potts,et al. A large annotated corpus for learning natural language inference , 2015, EMNLP.
[16] Christopher Potts,et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank , 2013, EMNLP.
[17] Aapo Hyvärinen,et al. Noise-contrastive estimation: A new estimation principle for unnormalized statistical models , 2010, AISTATS.
[18] Adam Tauman Kalai,et al. Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings , 2016, NIPS.
[19] Stefan Lee,et al. ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks , 2019, NeurIPS.
[20] Richard A. Harshman,et al. Indexing by Latent Semantic Analysis , 1990, J. Am. Soc. Inf. Sci.
[21] Danqi Chen,et al. Dense Passage Retrieval for Open-Domain Question Answering , 2020, EMNLP.
[22] Pengtao Xie,et al. CERT: Contrastive Self-supervised Learning for Language Understanding , 2020, ArXiv.
[23] Yelong Shen,et al. A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation , 2020, ArXiv.
[24] Gary D. Bader,et al. DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations , 2020, ACL.
[25] John C. Platt,et al. Learning Discriminative Projections for Text Similarity Measures , 2011, CoNLL.
[26] Quoc V. Le,et al. Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision , 2021, ICML.
[27] Ion Stoica,et al. Contrastive Code Representation Learning , 2021, EMNLP.
[28] Jimmy J. Lin,et al. Pretrained Transformers for Text Ranking: BERT and Beyond , 2020, WSDM.
[29] Hua Wu,et al. RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering , 2020, NAACL.
[30] Edouard Grave,et al. Towards Unsupervised Dense Information Retrieval with Contrastive Learning , 2021, ArXiv.
[31] Tianyu Gao,et al. SimCSE: Simple Contrastive Learning of Sentence Embeddings , 2021, EMNLP.
[32] Fabio Petroni,et al. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks , 2020, NeurIPS.
[33] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[34] Eunsol Choi,et al. TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension , 2017, ACL.
[35] Jianfeng Gao,et al. A Human Generated MAchine Reading COmprehension Dataset , 2018.
[36] Colin Raffel,et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer , 2019, J. Mach. Learn. Res.
[37] Mark Chen,et al. Language Models are Few-Shot Learners , 2020, NeurIPS.
[38] Iryna Gurevych,et al. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks , 2019, EMNLP.
[39] Christy Dennison,et al. Process for Adapting Language Models to Society (PALMS) with Values-Targeted Datasets , 2021, NeurIPS.
[40] Pascale Fung,et al. Reducing Gender Bias in Abusive Language Detection , 2018, EMNLP.
[41] Wallace S. Rutkowski,et al. TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE , 2022.
[42] Luke S. Zettlemoyer,et al. Deep Contextualized Word Representations , 2018, NAACL.
[43] Kihyuk Sohn,et al. Improved Deep Metric Learning with Multi-class N-pair Loss Objective , 2016, NIPS.
[44] Timothy P. Lillicrap,et al. Relevance Realization and the Emerging Framework in Cognitive Science , 2012, J. Log. Comput.
[45] D. Cheriton. From doc2query to docTTTTTquery , 2019.
[46] Yiming Yang,et al. On the Sentence Embeddings from BERT for Semantic Textual Similarity , 2020, EMNLP.
[47] Benjamin Piwowarski,et al. SPLADE v2: Sparse Lexical and Expansion Model for Information Retrieval , 2021, ArXiv.
[48] Allan Hanbury,et al. Efficiently Teaching an Effective Dense Retriever with Balanced Topic Aware Sampling , 2021, SIGIR.
[49] Jeffrey Pennington,et al. GloVe: Global Vectors for Word Representation , 2014, EMNLP.
[50] Douwe Kiela,et al. SentEval: An Evaluation Toolkit for Universal Sentence Representations , 2018, LREC.
[51] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[52] Yann LeCun,et al. Learning a similarity metric discriminatively, with application to face verification , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[53] Ye Li,et al. Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval , 2020, ArXiv.
[54] Yann LeCun,et al. Dimensionality Reduction by Learning an Invariant Mapping , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).
[55] Jieyu Zhao,et al. Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods , 2018, NAACL.
[56] Ming Zhou,et al. GraphCodeBERT: Pre-training Code Representations with Data Flow , 2020, ICLR.
[57] Xiaocheng Feng,et al. CodeBERT: A Pre-Trained Model for Programming and Natural Languages , 2020, EMNLP.
[58] Honglak Lee,et al. An efficient framework for learning sentence representations , 2018, ICLR.
[59] Iryna Gurevych,et al. BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models , 2021, NeurIPS Datasets and Benchmarks.
[60] Kaiming He,et al. Momentum Contrast for Unsupervised Visual Representation Learning , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[61] Sanja Fidler,et al. Skip-Thought Vectors , 2015, NIPS.
[62] Jeff Johnson,et al. Billion-Scale Similarity Search with GPUs , 2017, IEEE Transactions on Big Data.
[63] Ce Liu,et al. Supervised Contrastive Learning , 2020, NeurIPS.
[64] Arvind Narayanan,et al. Semantics derived automatically from language corpora contain human-like biases , 2016, Science.
[65] Robert L. Mercer,et al. Class-Based n-gram Models of Natural Language , 1992, CL.
[66] Oriol Vinyals,et al. Representation Learning with Contrastive Predictive Coding , 2018, ArXiv.
[67] Ilya Sutskever,et al. Learning Transferable Visual Models From Natural Language Supervision , 2021, ICML.
[68] Wojciech Zaremba,et al. Evaluating Large Language Models Trained on Code , 2021, ArXiv.
[69] Geoffrey E. Hinton,et al. A Simple Framework for Contrastive Learning of Visual Representations , 2020, ICML.
[70] Jimmy J. Lin,et al. Multi-Stage Document Ranking with BERT , 2019, ArXiv.
[71] Pascal Vincent,et al. Representation Learning: A Review and New Perspectives , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[72] Yann LeCun,et al. Barlow Twins: Self-Supervised Learning via Redundancy Reduction , 2021, ICML.
[73] Samuel R. Bowman,et al. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference , 2017, NAACL.
[74] Ildoo Kim,et al. ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision , 2021, ICML.
[75] Kwan Hui Lim,et al. An Unsupervised Sentence Embedding Method by Mutual Information Maximization , 2020, EMNLP.
[76] Kai Zou,et al. EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks , 2019, EMNLP.
[77] Christopher Potts,et al. ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction , 2021, ArXiv.
[78] Jeffrey Dean,et al. Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.
[79] Dahua Lin,et al. Unsupervised Feature Learning via Non-Parametric Instance-level Discrimination , 2018, ArXiv.
[80] Rachel Rudinger,et al. Gender Bias in Coreference Resolution , 2018, NAACL.