Learning to Remove: Towards Isotropic Pre-trained BERT Embedding
Jie Zheng | Jie Ren | Ling Gao | Yuxin Liang | Rui Cao
[1] Jordan Rodu, et al. Getting in Shape: Word Embedding SubSpaces, 2019, IJCAI.
[2] Martin Wattenberg, et al. Visualizing and Measuring the Geometry of BERT, 2019, NeurIPS.
[3] Di He, et al. FRAGE: Frequency-Agnostic Word Representation, 2018, NeurIPS.
[4] Edward Curry, et al. Word Re-Embedding via Manifold Dimensionality Retention, 2017, EMNLP.
[5] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[6] Tianlin Liu, et al. Unsupervised Post-processing of Word Vectors via Conceptor Negation, 2018, AAAI.
[7] Omer Levy, et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach, 2019, arXiv.
[8] Jeffrey Dean, et al. Efficient Estimation of Word Representations in Vector Space, 2013, ICLR.
[9] Luke S. Zettlemoyer, et al. Deep Contextualized Word Representations, 2018, NAACL.
[10] Christopher D. Manning, et al. Better Word Representations with Recursive Neural Networks for Morphology, 2013, CoNLL.
[11] Bofang Li, et al. The (too Many) Problems of Analogical Reasoning with Word Vectors, 2017, *SEMEVAL.
[12] Joao Sedoc, et al. Conceptor Debiasing of Word Representations Evaluated on WEAT, 2019, Proceedings of the First Workshop on Gender Bias in Natural Language Processing.
[13] Di He, et al. Representation Degeneration Problem in Training Natural Language Generation Models, 2019, ICLR.
[14] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[15] Evgeniy Gabrilovich, et al. A word at a time: computing word relatedness using temporal semantic analysis, 2011, WWW.
[16] Yiming Yang, et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding, 2019, NeurIPS.
[17] Jing Huang, et al. Improving Neural Language Generation with Spectrum Control, 2020, ICLR.
[18] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.
[19] Ehud Rivlin, et al. Placing search in context: the concept revisited, 2002, TOIS.
[20] Thomas Wolf, et al. HuggingFace's Transformers: State-of-the-art Natural Language Processing, 2019, arXiv.
[21] Pramod Viswanath, et al. All-but-the-Top: Simple and Effective Postprocessing for Word Representations, 2017, ICLR.
[22] Elia Bruni, et al. Multimodal Distributional Semantics, 2014, J. Artif. Intell. Res.
[23] Anna Rumshisky, et al. A Primer in BERTology: What We Know About How BERT Works, 2020, Transactions of the Association for Computational Linguistics.
[24] John B. Goodenough, et al. Contextual correlates of synonymy, 1965, CACM.
[25] Felix Hill, et al. SimVerb-3500: A Large-Scale Evaluation Set of Verb Similarity, 2016, EMNLP.
[26] Kawin Ethayarajh, et al. How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings, 2019, EMNLP.
[27] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[28] Felix Hill, et al. SimLex-999: Evaluating Semantic Models With (Genuine) Similarity Estimation, 2014, CL.
[29] Bill Yuchen Lin, et al. IsoBN: Fine-Tuning BERT with Isotropic Batch Normalization, 2020, AAAI.