Multilingual Language Models Predict Human Reading Behavior

We analyze whether large language models are able to predict patterns of human reading behavior. We compare the performance of language-specific and multilingual pretrained transformer models in predicting reading time measures that reflect natural human sentence processing of Dutch, English, German, and Russian texts. The resulting models of human reading behavior are accurate, indicating that transformer models implicitly encode relative importance in language in a way that is comparable to human processing mechanisms. We find that BERT and XLM models successfully predict a range of eye-tracking features. In a series of experiments, we analyze the cross-domain and cross-language abilities of these models and show how they reflect human sentence processing.
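
To make the setup concrete, here is a minimal sketch (not the authors' released code) of the kind of model the abstract describes: a pretrained multilingual transformer fine-tuned with a regression head to predict per-token eye-tracking features. The model name, the feature list, and the training details below are illustrative assumptions, not specifics taken from the paper.

```python
# Sketch: fine-tuning a multilingual transformer to regress eye-tracking
# features per token. Feature names and hyperparameters are assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

# Assumed subset of gaze features; the paper predicts a range of such measures.
FEATURES = ["first_fixation_duration", "total_reading_time"]

class GazeRegressor(nn.Module):
    def __init__(self, model_name="bert-base-multilingual-cased",
                 n_features=len(FEATURES)):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        # Linear head maps each token's hidden state to the gaze features.
        self.head = nn.Linear(self.encoder.config.hidden_size, n_features)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        return self.head(hidden)  # shape: (batch, seq_len, n_features)

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = GazeRegressor()

batch = tokenizer(["A sample sentence to read."], return_tensors="pt")
preds = model(batch["input_ids"], batch["attention_mask"])
loss_fn = nn.MSELoss()  # regression loss against per-token gaze targets
```

In practice, gaze corpora record features at the word level, so subword-token predictions would need to be aligned back to words (e.g., by averaging or taking the first subword) before computing the loss; the sketch leaves that alignment step out.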
