Recognizing users' gender in social media using linguistic features

Web 2.0 and social media give users an opportunity to discuss and share opinions. As a result, a considerable amount of information emerges that can be drawn upon to determine some demographic and behavioral features. This study is an attempt to predict gender, as a demographic feature, using linguistic features of data collected from users' comments in social media. For this purpose, a framework is proposed that predicts a user's gender by counting the occurrences of given words, including verbs, pronouns, articles, adjectives, adverbs, prepositions and numbers. The framework was then tested using comments left by readers of the Los Angeles Times, and the model was observed to predict gender with an accuracy of 66.66%. Security solutions and e-marketing could use this framework for authentication and niche marketing, respectively.
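The counting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the word lists or the classifier, so the tiny category lists, the tokenizer, and the function name `count_features` below are assumptions made purely for illustration, and only three of the listed word classes plus numbers are covered.

```python
import re

# Hypothetical, deliberately tiny word lists. The study's actual lists
# are not given in the abstract; these are placeholders for illustration.
CATEGORIES = {
    "pronoun": {"i", "you", "he", "she", "we", "they", "it", "my", "your"},
    "article": {"a", "an", "the"},
    "preposition": {"in", "on", "at", "of", "to", "with", "for", "from"},
}

# A token is either a run of letters/apostrophes or a number such as 2 or 3.5.
TOKEN_RE = re.compile(r"[a-z']+|\d+(?:\.\d+)?")


def count_features(comment):
    """Count how many tokens of a comment fall into each word category."""
    counts = {name: 0 for name in CATEGORIES}
    counts["number"] = 0
    for token in TOKEN_RE.findall(comment.lower()):
        if token[0].isdigit():
            counts["number"] += 1
            continue
        for name, words in CATEGORIES.items():
            if token in words:
                counts[name] += 1
    return counts


features = count_features("I gave the 2 books to my friend in the park")
print(features)
```

A vector of counts like this, computed per comment or per user, would be the input to whatever model maps linguistic features to a gender label; that model is left out here because the abstract does not specify it.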
