Superbizarre Is Not Superb: Derivational Morphology Improves BERT’s Interpretation of Complex Words
Valentin Hofmann | Janet B. Pierrehumbert | Hinrich Schütze
[1] L. Feldman. Modeling Morphological Processing , 2013 .
[2] Pius ten Hacken. Delineating Derivation and Inflection , 2014 .
[3] M. Taft. Recognition of affixed words and the word frequency effect , 1979, Memory & cognition.
[4] Ryan Cotterell,et al. Joint Semantic Synthesis and Morphological Analysis of the Derived Word , 2017, TACL.
[5] George Kurian,et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation , 2016, ArXiv.
[6] Mikko Kurimo,et al. Finnish ASR with Deep Transformer Models , 2020, INTERSPEECH.
[7] Marcus Taft,et al. Reading and the Mental Lexicon , 1991 .
[8] K. Rastle,et al. The processing of singular and plural nouns in French and English , 2004 .
[9] R. Harald Baayen,et al. Morphological dynamics in compound processing , 2008 .
[10] Shafiq R. Joty,et al. Mind Your Inflections! Improving NLP for Non-Standard Englishes with Base-Inflection Encoding , 2020, EMNLP.
[11] Suzanna Sia,et al. Tired of Topic Models? Clusters of Pretrained Word Embeddings Make for Fast and Good Topics too! , 2020, EMNLP.
[12] J. Pierrehumbert,et al. Morphological convergence as on-line lexical analogy , 2020, Language.
[13] M. Taft. Prefix Stripping Revisited. , 1981 .
[14] Laurie Beth Feldman,et al. Morphological aspects of language processing. , 1997 .
[15] Laurent Romary,et al. CamemBERT: a Tasty French Language Model , 2019, ACL.
[16] Kenneth Ward Church,et al. Emerging trends: Subwords, seriously? , 2020, Natural Language Engineering.
[17] Philip Gage,et al. A new algorithm for data compression , 1994 .
[18] Parminder Bhatia,et al. Morphological Priors for Probabilistic Neural Word Embeddings , 2016, EMNLP.
[19] Hinrich Schütze,et al. DagoBERT: Generating Derivational Morphology with a Pretrained Language Model , 2020, EMNLP.
[20] Phil Blunsom,et al. Compositional Morphology for Word Representations and Language Modelling , 2014, ICML.
[21] Alexander M. Rush,et al. Character-Aware Neural Language Models , 2015, AAAI.
[22] Christopher D. Manning,et al. Better Word Representations with Recursive Neural Networks for Morphology , 2013, CoNLL.
[23] A. Laudanna,et al. Distributional properties of derivational affixes: Implications for processing , 1995 .
[24] Jeffrey Dean,et al. Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.
[25] R. Baayen,et al. Reading polymorphemic Dutch compounds: toward a multiple route model of lexical processing. , 2009, Journal of experimental psychology. Human perception and performance.
[26] Daniel Edmiston,et al. A Systematic Analysis of Morphological Content in BERT Models for Multiple Languages , 2020, ArXiv.
[27] Jan Snajder,et al. Obtaining a Better Understanding of Distributional Models of German Derivational Morphology , 2015, IWCS.
[28] Benjamin Heinzerling,et al. Sequence Tagging with Contextual and Non-Contextual Subword Representations: A Multilingual Evaluation , 2019, ACL.
[29] Lars Borin,et al. What is a lexical representation? , 1985, NODALIDA.
[30] C. Pliatsikas,et al. Morphological processing in the brain: the good (inflection), the bad (derivation) and the ugly (compounding) , 2019, Cortex.
[31] Jianmo Ni,et al. Justifying Recommendations using Distantly-Labeled Reviews and Fine-Grained Aspects , 2019, EMNLP.
[32] Ryan Cotterell,et al. Context-Aware Prediction of Derivational Word-forms , 2017, EACL.
[33] Quoc V. Le,et al. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators , 2020, ICLR.
[34] Marcus Taft,et al. Interactive-activation as a framework for understanding morphological processing , 1994 .
[35] Aline Villavicencio,et al. Incorporating Subword Information into Matrix Factorization Word Embeddings , 2018, ArXiv.
[36] L. Manelis,et al. The processing of affixed words , 1977, Memory & cognition.
[37] Benoît Sagot,et al. What Does BERT Learn about the Structure of Language? , 2019, ACL.
[38] Elena Paslaru Bontas Simperl,et al. A Query Log Analysis of Dataset Search , 2017, ICWE.
[39] Yiming Yang,et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding , 2019, NeurIPS.
[40] Jacob Eisenstein,et al. Will it Unblend? , 2020, FINDINGS.
[41] M. Taft. A morphological-decomposition model of lexical representation , 1988 .
[42] Colin Raffel,et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer , 2019, J. Mach. Learn. Res.
[43] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[44] Hinrich Schütze,et al. Predicting the Growth of Morphological Families from Social and Linguistic Factors , 2020, ACL.
[45] Christopher D. Manning,et al. A Structural Probe for Finding Syntax in Word Representations , 2019, NAACL.
[46] Jonathan Grainger,et al. On the role of derivational affixes in recognizing complex words: Evidence from masked priming , 2003 .
[47] Harald Baayen,et al. Productivity and English derivation: a corpus-based study , 1991 .
[48] Matthew H. Davis,et al. Morphological decomposition based on the analysis of orthography , 2008 .
[49] Ting Liu,et al. CharBERT: Character-aware Pre-trained Language Model , 2020, COLING.
[50] J. Pierrehumbert,et al. Gendered associations of English morphology , 2018 .
[51] R. Baayen,et al. Singulars and plurals in Dutch: Evidence for a parallel dual-route model , 1997 .
[52] Joseph P. Stemberger,et al. Rule-Less Morphology at the Phonology-Lexicon Interface , 1994 .
[53] K. Forster,et al. Lexical storage and retrieval of prefixed words , 1975 .
[54] Rico Sennrich,et al. Neural Machine Translation of Rare Words with Subword Units , 2015, ACL.
[55] Yi Yang,et al. Overcoming Language Variation in Sentiment Analysis with Social Attention , 2015, TACL.
[56] Alfonso Caramazza,et al. Representation and processing of derived words , 1987 .
[57] Goran Glavas,et al. Probing Pretrained Language Models for Lexical Semantics , 2020, EMNLP.
[58] Mari Ostendorf,et al. Exponential Language Modeling Using Morphological Features and Multi-Task Learning , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[59] Roy Schwartz,et al. Show Your Work: Improved Reporting of Experimental Results , 2019, EMNLP.
[60] Alessandro Laudanna,et al. Units of Representation for Derived Words in the Lexicon , 1992 .
[61] Robert Schreuder,et al. Constraining psycholinguistic models of morphological processing and representation: The role of productivity , 1992 .
[62] Joan L. Bybee,et al. Regular morphology and the lexicon. , 1995 .
[63] Matej Klemen,et al. Enhancing deep neural networks with morphological information , 2020, ArXiv.
[64] Timothy Baldwin,et al. Lexical Normalisation of Short Text Messages: Makn Sens a #twitter , 2011, ACL.
[65] Kawin Ethayarajh,et al. How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings , 2019, EMNLP.
[66] Ilya Sutskever,et al. Language Models are Unsupervised Multitask Learners , 2019 .
[67] Dan Roth,et al. A Distributional and Orthographic Aggregation Model for English Derivational Morphology , 2018, ACL.
[68] R. Holloway. The broth in my brother’s brothel: Morpho-orthographic segmentation in visual word recognition , 2005 .
[69] Heike Adel,et al. Overview of Character-Based Models for Natural Language Processing , 2017, CICLing.
[70] C. Fowler,et al. The inflected noun system in Serbo-Croatian: Lexical representation of morphological structure , 1987, Memory & cognition.
[71] B. Butterworth,et al. Language Production II: Development, Writing, and Other Language Processes , 1985 .
[72] R. H. Baayen,et al. Morphology in the Mental Lexicon: A Computational Model for Visual Word Recognition , 2000 .
[73] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[74] Richard A. Harshman,et al. Indexing by Latent Semantic Analysis , 1990, J. Am. Soc. Inf. Sci.
[75] Matthew Henderson,et al. Efficient Intent Detection with Dual Sentence Encoders , 2020, NLP4CONVAI.
[76] David Yarowsky,et al. Paradigm Completion for Derivational Morphology , 2017, EMNLP.
[77] Hinrich Schütze,et al. Word Space , 1992, NIPS.
[78] Marco Marelli,et al. Compositional-ly Derived Representations of Morphologically Complex Words in Distributional Semantics , 2013, ACL.
[79] Hinrich Schütze,et al. Negated LAMA: Birds cannot fly , 2019, ArXiv.
[80] Ingo Plag,et al. Word-Formation in English , 2018 .
[81] M. Taft. Morphological Decomposition and the Reverse Base Frequency Effect , 2004, The Quarterly journal of experimental psychology. A, Human experimental psychology.
[82] Gregory Stump,et al. Some sources of apparent gaps in derivational paradigms , 2018, Morphology.
[83] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[84] R. Baayen,et al. Affixal Homonymy triggers full-form storage, even with inflected words, even in a morphologically rich language , 2000, Cognition.
[85] Andrew Gordon Wilson,et al. Probabilistic FastText for Multi-Sense Word Embeddings , 2018, ACL.
[86] Daniel Jurafsky,et al. Distant supervision for relation extraction without labeled data , 2009, ACL.
[87] Hinrich Schütze,et al. A Graph Auto-encoder Model of Derivational Morphology , 2020, ACL.
[88] P. Gordon,et al. Frequency Effects and the Representational Status of Regular Inflections , 1999 .
[89] Jan Snajder,et al. Predictability of Distributional Semantics in Derivational Word Formation , 2016, COLING.
[90] Tie-Yan Liu,et al. Co-learning of Word Representations and Morpheme Representations , 2014, COLING.
[91] Mike Schuster,et al. Japanese and Korean voice search , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[92] Jeffrey Pennington,et al. GloVe: Global Vectors for Word Representation , 2014, EMNLP.
[93] Allyson Ettinger,et al. What BERT Is Not: Lessons from a New Suite of Psycholinguistic Diagnostics for Language Models , 2019, TACL.
[94] Tomas Mikolov,et al. Enriching Word Vectors with Subword Information , 2016, TACL.
[95] Jonathan Grainger,et al. Differences in the Processing of Prefixes and Suffixes Revealed by a Letter-Search Task , 2015 .
[96] A. Laudanna,et al. Address mechanisms to decomposed lexical entries , 1985 .
[97] R. Baayen,et al. The balance of storage and computation in morphological processing: the role of word formation type, affixal homonymy, and productivity. , 2000, Journal of experimental psychology. Learning, memory, and cognition.