Context Limitations Make Neural Language Models More Human-Like
[1] Lukas Galke, et al. Emergent Communication for Understanding Human Language Evolution: What's Missing?, 2022, arXiv.
[2] Mohit Iyyer, et al. Do Long-Range Language Models Actually Use Long-Range Context?, 2021, EMNLP.
[3] Hiroshi Noji, et al. Modeling Human Sentence Processing with Left-Corner Recurrent Neural Network Grammars, 2021, EMNLP.
[4] Benjamin K. Bergen, et al. Different kinds of cognitive plausibility: why are transformers better than RNNs at predicting N400 amplitude?, 2021, CogSci.
[5] Jacob Andreas, et al. What Context Features Can Transformer Language Models Use?, 2021, ACL.
[6] Kentaro Inui, et al. Lower Perplexity is Not Always Human-Like, 2021, ACL.
[7] Rahma Chaabouni, et al. “LazImpa”: Lazy and Impatient neural agents learn to communicate efficiently, 2020, CoNLL.
[8] Robert Frank, et al. Probabilistic Predictions of People Perusing: Evaluating Metrics of Language Model Performance for Psycholinguistic Modeling, 2020, CMCL.
[9] Richard Futrell, et al. Dependency locality as an explanatory principle for word order, 2020, Language.
[10] Roger Levy, et al. On the Predictive Power of Neural Language Models for Human Real-Time Comprehension Behavior, 2020, CogSci.
[11] S. Frank, et al. Human Sentence Processing: Recurrence or Attention?, 2020, CMCL.
[12] Tal Linzen, et al. How Can We Accelerate Progress Towards Human-like Linguistic Generalization?, 2020, ACL.
[13] Richard Futrell, et al. Lossy-Context Surprisal: An Information-Theoretic Model of Memory Effects in Sentence Processing, 2020, Cogn. Sci.
[14] Richard Futrell, et al. Universals of word order reflect optimization of grammars for efficient communication, 2020, Proceedings of the National Academy of Sciences.
[15] Roger Levy, et al. Linking artificial and human neural representations of language, 2019, EMNLP.
[16] E. Gibson, et al. How Efficiency Shapes Human Language, 2019, Trends in Cognitive Sciences.
[17] Myle Ott, et al. fairseq: A Fast, Extensible Toolkit for Sequence Modeling, 2019, NAACL.
[18] Masayuki Asahara, et al. UD-Japanese BCCWJ: Universal Dependencies Annotation for the Balanced Corpus of Contemporary Written Japanese, 2018, UDW@EMNLP.
[19] Stefan Frank, et al. Comparing Gated and Simple Recurrent Neural Network Architectures as Models of Human Sentence Processing, 2018, CogSci.
[20] John Hale, et al. Finding syntax in human encephalography with beam search, 2018, ACL.
[21] Daniel Jurafsky, et al. Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context, 2018, ACL.
[22] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[23] Roger Levy, et al. Noisy-context surprisal as a human sentence processing cost model, 2017, EACL.
[24] Masayuki Asahara, et al. Reading-Time Annotations for “Balanced Corpus of Contemporary Written Japanese”, 2016, COLING.
[25] Richard Socher, et al. Pointer Sentinel Mixture Models, 2016, ICLR.
[26] Emmanuel Dupoux, et al. Cognitive science in the era of artificial intelligence: A roadmap for reverse-engineering the infant language-learner, 2016, Cognition.
[27] Mark Steedman, et al. Assessing Relative Sentence Complexity using an Incremental CCG Parser, 2016, NAACL.
[28] Shravan Vasishth, et al. Cross-linguistic differences in processing double-embedded relative clauses: Working-memory constraints or language statistics?, 2016, CogSci.
[29] Alexandra Birch, et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.
[30] Narayanan Srinivasan, et al. Strong Expectations Cancel Locality Effects: Evidence from Hindi, 2014, PLoS ONE.
[31] D. Bates, et al. Fitting Linear Mixed-Effects Models Using lme4, 2014, arXiv:1406.5823.
[32] K. Maekawa. Balanced Corpus of Contemporary Written Japanese, 2008, IJCNLP.
[33] Nathaniel J. Smith, et al. The effect of word predictability on reading time is logarithmic, 2013, Cognition.
[34] Philipp Koehn, et al. Scalable Modified Kneser-Ney Language Model Estimation, 2013, ACL.
[35] William Schuler, et al. A Model of Language Processing as Hierarchic Sequential Prediction, 2013, Top. Cogn. Sci.
[36] A. Clark. Whatever next? Predictive brains, situated agents, and the future of cognitive science, 2013, Behavioral and Brain Sciences.
[37] Roger Levy, et al. Sequential vs. Hierarchical Syntactic Models of Human Incremental Sentence Processing, 2012, CMCL@NAACL-HLT.
[38] Richard L. Lewis, et al. Short-term forgetting in sentence comprehension: Crosslinguistic evidence from verb-final structures, 2010.
[39] Frank Keller, et al. Data from eye-tracking corpora as evidence for theories of syntactic processing complexity, 2008, Cognition.
[40] R. Levy. Expectation-based syntactic comprehension, 2008, Cognition.
[41] Richard L. Lewis, et al. Computational principles of working memory in sentence comprehension, 2006, Trends in Cognitive Sciences.
[42] Richard L. Lewis, et al. An Activation-Based Model of Sentence Processing as Skilled Memory Retrieval, 2005, Cogn. Sci.
[43] Noam Chomsky. Three Factors in Language Design, 2005, Linguistic Inquiry.
[44] John Hale, et al. A Probabilistic Earley Parser as a Psycholinguistic Model, 2001, NAACL.
[45] N. Cowan. The magical number 4 in short-term memory: A reconsideration of mental storage capacity, 2001, Behavioral and Brain Sciences.
[46] L. Konieczny. Locality and Parsing Complexity, 2000, Journal of Psycholinguistic Research.
[47] E. Gibson. Linguistic complexity: locality of syntactic dependencies, 1998, Cognition.
[48] S. Hochreiter, et al. Long Short-Term Memory, 1997, Neural Computation.
[49] John A. Hawkins, et al. A Performance Theory of Order and Constituency, 1995.
[50] Joseph H. Greenberg, et al. Some Universals of Grammar with Particular Reference to the Order of Meaningful Elements, 1990, On Language.
[51] Why Does Surprisal From Smaller GPT-2 Models Provide Better Fit to Human Reading Times?, 2022.
[52] William Schuler, et al. Surprisal Estimators for Human Reading Times Need Character Models, 2021, ACL.
[53] Koki Washio, et al. On the Relationship between Zipf’s Law of Abbreviation and Interfering Noise in Emergent Languages, 2021, ACL.
[54] Richard Futrell, et al. Modeling word and morpheme order in natural language as an efficient trade-off of memory and surprisal, 2020, Psychological Review.
[55] Eghbal A. Hosseini, et al. The neural architecture of language: Integrative reverse-engineering converges on a model for predictive processing, 2020.
[56] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[57] Adam Goodkind, et al. Predictive power of word surprisal for reading times is a linear function of language model quality, 2018, CMCL.
[58] Maria Barrett, et al. The Dundee Treebank, 2015.
[59] M. Crocker. Computational Psycholinguistics, 2009.
[60] Edward Gibson. Distinguishing theories of syntactic expectation cost in sentence comprehension: evidence from Japanese, 2008.
[61] Taku Kudo, et al. MeCab: Yet Another Part-of-Speech and Morphological Analyzer, 2005.
[62] E. Gibson. The dependency locality theory: A distance-based theory of linguistic complexity, 2000.
[63] G. A. Miller. The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information, 1956, Psychological Review.