Examining the Presence of Gender Bias in Customer Reviews Using Word Embedding

Humans have entered the age of algorithms. Every minute, algorithms shape countless preferences, from suggesting a product to suggesting a potential life partner. In the marketplace, algorithms are trained to learn consumer preferences from customer reviews, because user-generated reviews are considered the voice of customers and a valuable source of information for firms. Insights mined from reviews play an indispensable role in several business activities, ranging from product recommendation and targeted advertising to promotions and segmentation. In this research, we question whether reviews might hold stereotypic gender bias that algorithms learn and propagate. Utilizing data from millions of observations and a word embedding approach, GloVe, we show that algorithms designed to learn from human language output also learn gender bias. We also examine why such biases occur: whether the bias arises from a negative bias against females or a positive bias toward males. We examine the impact of gender bias in reviews on choice and conclude with policy implications for female consumers, especially when they are unaware of the bias, and the ethical implications for firms.
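To make the measurement idea concrete, the following is a minimal sketch of how gender associations can be read off a trained embedding space, in the spirit of word-embedding association tests (Caliskan et al., Science 2017). It is not the paper's actual pipeline: the GloVe file path, the target words, and the gendered attribute lists are all illustrative assumptions.

```python
import numpy as np

def load_glove(path):
    """Load GloVe vectors from the standard whitespace-separated text format."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def gender_association(word, female_terms, male_terms, vecs):
    """Mean similarity to female terms minus mean similarity to male terms;
    positive values indicate a female-leaning association for `word`."""
    f = np.mean([cosine(vecs[word], vecs[t]) for t in female_terms if t in vecs])
    m = np.mean([cosine(vecs[word], vecs[t]) for t in male_terms if t in vecs])
    return f - m

# Hypothetical usage: embeddings assumed to be trained on review text;
# target adjectives are placeholders, not the study's stimuli.
vecs = load_glove("glove.reviews.300d.txt")
female = ["she", "her", "woman", "female"]
male = ["he", "him", "man", "male"]
for target in ["reliable", "emotional", "technical", "gentle"]:
    if target in vecs:
        print(f"{target}: {gender_association(target, female, male, vecs):+.3f}")
```

A systematic analysis would replace the placeholder word lists with validated stimuli and test whether the association differences are statistically distinguishable from zero via permutation tests, as in the association-test literature.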
