
To tackle sexism online effectively, research has focused on automated methods for detecting it. In this paper, we use items from psychological scales and adversarial sample generation to 1) provide a codebook for different types of sexism in theory-driven scales and in social media text; 2) test the performance of different sexism detection methods across multiple data sets; and 3) provide an overview of the strategies humans employ to remove sexism through minimal changes. Our results indicate that current methods seem inadequate for detecting all but the most blatant forms of sexism and do not generalize well to out-of-domain examples. By providing a scale-based codebook for sexism and insights into what makes a statement sexist, we hope to contribute to the development of better and broader models for sexism detection, including reflections on theory-driven approaches to data collection.
