A Hybrid Model for Role-related User Classification on Twitter

To support a variety of research studies, we propose TWIROLE, a hybrid model for role-related user classification on Twitter that labels each user as male-related, female-related, or brand-related (i.e., an organization or institution). TWIROLE extracts features from tweet content, user profiles, and profile images, and combines them in a hybrid model to identify a user's role. We evaluated it on two existing large datasets of Twitter users, running both intra- and inter-comparison experiments. TWIROLE outperforms existing methods and yields more balanced results across the three roles. We also confirm that user names and profile images are good indicators for this task. Our research extends prior work that does not consider brand-related users, and can aid future evaluation efforts for investigations that rely on self-labeled datasets.
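
To make the idea of combining evidence from several sources concrete, the following is a minimal sketch of one generic way to do it: late fusion by weighted averaging of per-source class probabilities. It is not TWIROLE's actual architecture, which the abstract does not specify. The function names `fuse_role_scores` and `predict_role`, the equal default weights, and the example scores are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional, Sequence

# The three roles named in the abstract.
ROLES = ("male", "female", "brand")


def fuse_role_scores(
    per_source: Sequence[Dict[str, float]],
    weights: Optional[Sequence[float]] = None,
) -> Dict[str, float]:
    """Weighted average of per-source role probabilities (assumed fusion rule)."""
    if not per_source:
        raise ValueError("at least one source of scores is required")
    if weights is None:
        # Assumption: every source is trusted equally.
        weights = [1.0] * len(per_source)
    if len(weights) != len(per_source):
        raise ValueError("weights and per_source must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return {
        role: sum(w * scores.get(role, 0.0) for w, scores in zip(weights, per_source)) / total
        for role in ROLES
    }


def predict_role(fused: Dict[str, float]) -> str:
    """Return the role with the highest fused score."""
    return max(ROLES, key=lambda role: fused[role])


if __name__ == "__main__":
    # Hypothetical outputs of three separate classifiers for one user.
    text_scores = {"male": 0.5, "female": 0.3, "brand": 0.2}
    profile_scores = {"male": 0.2, "female": 0.2, "brand": 0.6}
    image_scores = {"male": 0.1, "female": 0.1, "brand": 0.8}
    fused = fuse_role_scores([text_scores, profile_scores, image_scores])
    print(predict_role(fused), fused)
```

In this sketch the tweet-text classifier alone would pick "male", but the profile and image scores shift the fused decision to "brand", which illustrates why drawing on more than one source can change the outcome.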
