Revisiting the Uniform Information Density Hypothesis
Clara Meister | Tiago Pimentel | Patrick Haller | Lena A. Jäger | Ryan Cotterell | Roger Levy
[1] T. Florian Jaeger, et al. Redundancy and reduction: Speakers manage syntactic information density, 2010, Cognitive Psychology.
[2] R. Levy. Expectation-based syntactic comprehension, 2008, Cognition.
[3] Eugene Charniak, et al. Entropy Rate Constancy in Text, 2002, ACL.
[4] Ryan Cotterell, et al. If Beam Search Is the Answer, What Was the Question?, 2020, EMNLP.
[5] Wouter Duyck, et al. Presenting GECO: An eyetracking corpus of monolingual and bilingual sentence reading, 2017, Behavior Research Methods.
[6] Clara Meister, et al. A Cognitive Regularizer for Language Modeling, 2021, ACL.
[7] Hermann Ney, et al. On structuring probabilistic dependences in stochastic language modelling, 1994, Comput. Speech Lang.
[8] Alexander Clark, et al. Grammaticality, Acceptability, and Probability: A Probabilistic View of Linguistic Knowledge, 2017, Cogn. Sci.
[9] Michael Xavier Collins. Information Density and Dependency Length as Complementary Cognitive Models, 2014, Journal of Psycholinguistic Research.
[10] Samuel R. Bowman, et al. Neural Network Acceptability Judgments, 2018, Transactions of the Association for Computational Linguistics.
[11] Rico Sennrich, et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.
[12] Nathaniel J. Smith, et al. The effect of word predictability on reading time is logarithmic, 2013, Cognition.
[13] Gabriella Vigliocco, et al. Word surprisal predicts N400 amplitude during reading, 2013, ACL.
[14] Christopher D. Manning, et al. Probabilistic models of word order and syntactic discontinuity, 2005.
[15] Frank Keller, et al. Data from eye-tracking corpora as evidence for theories of syntactic processing complexity, 2008, Cognition.
[16] Gabriella Vigliocco, et al. Lexical surprisal as a general predictor of reading time, 2012, EACL.
[17] Frank Keller, et al. Syntactic and Semantic Factors in Processing Difficulty: An Integrated Measure, 2010, ACL.
[18] Richard Socher, et al. Pointer Sentinel Mixture Models, 2016, ICLR.
[19] John Hale, et al. A Probabilistic Earley Parser as a Psycholinguistic Model, 2001, NAACL.
[20] Richard Futrell, et al. The Natural Stories Corpus, 2017, LREC.
[21] Sumeet Agarwal, et al. Uniform Information Density Effects on Syntactic Choice in Hindi, 2018.
[22] R.C. Schaefer, et al. Good as new, 2005, IEEE Industry Applications Magazine.
[23] Kenneth Heafield, et al. KenLM: Faster and Smaller Language Model Queries, 2011, WMT@EMNLP.
[24] S. Frank, et al. Insensitivity of the Human Sentence-Processing System to Hierarchical Structure, 2011, Psychological Science.
[25] D. Speelman, et al. Comparing explanations for the Complexity Principle: evidence from argument realization, 2018, Language and Cognition.
[26] Roger Levy, et al. Speakers optimize information density through syntactic reduction, 2006, NIPS.
[27] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[28] F. Pellegrino, et al. Different languages, similar encoding efficiency: Comparable information rates across the human communicative niche, 2019, Science Advances.
[29] Alice Turk, et al. The Smooth Signal Redundancy Hypothesis: A Functional Explanation for Relationships between Redundancy, Prosodic Prominence, and Duration in Spontaneous Speech, 2004, Language and Speech.
[30] Yiming Yang, et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context, 2019, ACL.
[31] Elizabeth Salesky, et al. A surprisal–duration trade-off across and within the world's languages, 2021, EMNLP.
[32] Roger Levy, et al. Availability-Based Production Predicts Speakers' Real-time Choices of Mandarin Classifiers, 2019, CogSci.
[33] Lysandre Debut, et al. HuggingFace's Transformers: State-of-the-art Natural Language Processing, 2019, arXiv.
[34] T. Jaeger, et al. Proceedings of the Annual Meeting of the Cognitive Science Society, 2008.
[35] Jelke Bloem, et al. Testing the Processing Hypothesis of word order variation using a probabilistic language model, 2016, CL4LC@COLING.
[36] John Hoeks, et al. Modeling the Noun Phrase versus Sentence Coordination Ambiguity in Dutch: Evidence from Surprisal Theory, 2010, CMCL@ACL.
[37] Adam Goodkind, et al. Predictive power of word surprisal for reading times is a linear function of language model quality, 2018, CMCL.
[38] Dan Jurafsky, et al. Effects of disfluencies, predictability, and utterance position on word form variation in English conversation, 2003, The Journal of the Acoustical Society of America.
[39] Matthew W. Crocker, et al. Information density of encodings: The role of syntactic variation in comprehension, 2017, CogSci.
[40] Roger Levy, et al. On the Predictive Power of Neural Language Models for Human Real-Time Comprehension Behavior, 2020, CogSci.
[41] Meilin Zhan, et al. Comparing Theories of Speaker Choice Using a Model of Classifier Production in Mandarin Chinese, 2018, NAACL.
[42] Sascha Topolinski, et al. The architecture of intuition: Fluency and affect determine intuitive judgments of semantic and visual coherence and judgments of grammaticality in artificial grammar learning, 2009, Journal of Experimental Psychology: General.
[43] C. E. Shannon. A Mathematical Theory of Communication, 1948, Bell System Technical Journal.
[44] Steven G. Luke, et al. The Provo Corpus: A large eye-tracking corpus with predictability norms, 2018, Behavior Research Methods.
[45] G. Kuperberg, et al. Word predictability effects are linear, not logarithmic: Implications for probabilistic models of sentence comprehension, 2021, Journal of Memory and Language.
[46] Vera Demberg, et al. Uniform Surprisal at the Level of Discourse Relations: Negation Markers and Discourse Connective Omission, 2015, IWCS.
[47] K. Rayner. Eye movements in reading and information processing: 20 years of research, 1998, Psychological Bulletin.
[48] G. Crooks. On Measures of Entropy and Information, 2015.
[49] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[50] F. Pellegrino, et al. Across-Language Perspective on Speech Information Rate, 2011.
[51] S. Piantadosi, et al. Info/information theory: Speakers choose shorter words in predictive contexts, 2013, Cognition.
[52] Yohei Oseki, et al. Lower Perplexity is Not Always Human-Like, 2021, ACL/IJCNLP.
[53] Roger Levy, et al. Communicative Efficiency, Uniform Information Density, and the Rational Speech Act Theory, 2018, CogSci.