Probing Pre-Trained Language Models for Cross-Cultural Differences in Values

Language embeds information about the social, cultural, and political values people hold. Prior work has explored potentially harmful social biases encoded in Pre-trained Language Models (PLMs), but there has been no systematic study of how the values embedded in these models vary across cultures. In this paper, we introduce probes to study which cross-cultural values are embedded in PLMs, and whether they align with existing theories and cross-cultural values surveys. We find that PLMs capture differences in values across cultures, but that these only weakly align with established values surveys. We discuss the implications of deploying mis-aligned models in cross-cultural settings, as well as ways of aligning PLMs with values surveys.
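As a rough illustration of what such a probe can look like, the sketch below rephrases a survey-style item as a cloze statement and compares a masked language model's scores for opposing completions. The checkpoint, template, and candidate words are our own illustrative assumptions, not the paper's actual probes.

```python
# A minimal sketch of cloze-style value probing with a masked LM, using
# the HuggingFace fill-mask pipeline. The model checkpoint, template,
# and candidate completions are illustrative assumptions.
from transformers import pipeline

# Any masked-LM checkpoint works; a multilingual one allows translating
# the template to compare value-laden completions across languages.
fill = pipeline("fill-mask", model="bert-base-multilingual-cased")

# Rephrase a values-survey item as a fill-in-the-blank statement and
# compare the model's scores for opposing completions (a probe loosely
# in the spirit of Hofstede's power-distance dimension).
template = f"Employees should {fill.tokenizer.mask_token} their manager's decisions."

# Candidates should be single tokens in the model's vocabulary; the
# pipeline falls back to the first subword (with a warning) otherwise.
candidates = ["question", "accept"]

for result in fill(template, targets=candidates):
    print(f"{result['token_str']:>10}: {result['score']:.4f}")
```

Repeating such a probe over templates translated into different languages, and over the dimensions of an established values survey, yields per-culture score profiles that can then be correlated with the survey data itself.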
