An Analysis of WordNet’s Coverage of Gender Identity Using Twitter and The National Transgender Discrimination Survey

While gender identities in the Western world are typically regarded as binary, our previous work (Hicks et al., 2015) shows that people use a much wider variety of terms to identify their gender. There is also a growing need to represent this variety of gender identities lexically. In that work, we developed a set of tools and approaches for analyzing Twitter data as a basis for generating hypotheses about the language used to identify gender and discuss gender-related issues across geographic regions and population groups in the U.S.A. In this paper, we analyze the coverage and relative frequency of the word forms in our Twitter analysis with respect to the National Transgender Discrimination Survey data set, one of the most comprehensive data sets on transgender, gender non-conforming, and gender variant people in the U.S.A. We then analyze the coverage of WordNet, a widely used lexical database, with respect to these identities and discuss some key considerations and next steps for adding gender identity words and their meanings to WordNet.

[1] Marti A. Hearst. Automatic Acquisition of Hyponyms from Large Text Corpora, 1992, COLING.

[2] Brian Mustanski, et al. Exploring the Diversity of Gender and Sexual Orientation Identities in an Online Sample of Transgender Individuals, 2012, Journal of sex research.

[3] Walter W. Hauck, et al. Effects of Interviewer Gender, Interviewer Choice, and Item Wording on Responses to Questions Concerning Sexual Behavior, 1996.

[4] C. Carver, et al. American Regional Dialects: A Word Geography, 1987.

[5] Jack Chambers, et al. Region and Language Variation, 2000.

[6] J. Grant, et al. Injustice at Every Turn: A Report of the National Transgender Discrimination Survey, 2011.

[7] Ewan Klein, et al. Natural Language Processing with Python, 2009.

[8] Caroline F. Pukall, et al. Somewhere under the rainbow: Exploring the identities and experiences of trans persons, 2014.

[9] John Nerbonne, et al. How much does Geography Influence Language Variation, 2009.

[10] L. Jody, et al. A Gender Not Listed Here: Genderqueers, Gender Rebels, and OtherWise in the National Transgender Discrimination Survey, 2012.

[11] Christiane Fellbaum, et al. Mining Twitter as a First Step toward Assessing the Adequacy of Gender Identification Terms on Intake Forms, 2015, AMIA.

[12] Greta R. Bauer, et al. Sex and Gender Diversity Among Transgender Persons in Ontario, Canada: Results From a Respondent-Driven Sampling Survey, 2014, Journal of sex research.

[13] Robert Graham, et al. The Health of Lesbian, Gay, Bisexual, and Transgender People: Building a Foundation for Better Understanding, 2011.