Inferring nationalities of Twitter users and studying inter-national linking

Twitter user profiles contain rich information that allows researchers to infer particular attributes of users' identities. Knowing identity attributes such as gender, age, or nationality is a first step in many computational social science studies. Often, it is only through such attributes that social media studies of, for example, the isolation of foreigners become possible. However, Twitter users rarely state such characteristics explicitly, so researchers must infer them by other means. In this paper, we discuss the challenge of detecting the nationality of Twitter users using rich features from their profiles, and we examine how effective different features are for this task. For the case of a highly diverse country, Qatar, we provide a detailed network analysis with insights into user behaviors and linking preferences (or the lack thereof) toward other nationalities.
