Toward Gender-Inclusive Coreference Resolution: An Analysis of Gender and Bias Throughout the Machine Learning Lifecycle

Abstract: Correctly resolving textual mentions of people fundamentally entails making inferences about those people. Such inferences raise the risk of systematic biases in coreference resolution systems, including biases that can harm binary and non-binary trans and cis stakeholders. To better understand such biases, we foreground nuanced conceptualizations of gender from sociology and sociolinguistics, and investigate where in the machine learning pipeline such biases can enter a coreference resolution system. We inspect many existing data sets for trans-exclusionary biases, and develop two new data sets for interrogating bias both in crowd annotations and in existing coreference resolution systems. Through these studies, conducted on English text, we confirm that unless we acknowledge the complexity of gender and build systems that recognize it, we will build systems that fail along three dimensions: quality of service, stereotyping, and over- or under-representation, especially for binary and non-binary trans users.
