Estimation of Twitter user's nationality based on friends and followers information

Abstract. Big Data has become useful in many fields because it can answer important questions that significantly improve decision making and process optimization. One of the most interesting domains in Big Data is the prediction of human features, facts, and behaviors. This paper proposes a new and effective algorithm to predict the nationality of Twitter users. The algorithm predicts a Twitter user's location from their friends' location information alone, without needing GPS information. Although only approximately 30% of Twitter users write their location information in a meaningful form, this paper shows that this percentage is enough to determine a user's nationality. The proposed algorithm is applied to estimate the thresholds that are then used to determine the nationality of Twitter users. The results show that the algorithm correctly classifies 90% of Twitter users on average.
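The idea of inferring nationality from friends' declared locations can be illustrated with a minimal sketch. It is not the paper's algorithm: the function name `estimate_nationality`, the majority-share rule, and the default `threshold` of 0.5 are all assumptions made for illustration, and the abstract does not state how the thresholds are estimated or applied.

```python
from collections import Counter
from typing import Optional, Sequence


def estimate_nationality(
    friend_countries: Sequence[Optional[str]],
    threshold: float = 0.5,  # hypothetical cut-off, not a value from the paper
) -> Optional[str]:
    """Return the most common country among friends, or None if undecided.

    friend_countries holds one entry per friend or follower: a country name
    when the profile location was meaningful, otherwise None.
    """
    # Keep only friends whose location could be resolved to a country.
    known = [c for c in friend_countries if c is not None]
    if not known:
        return None

    # Pick the most frequent country and compute its share of known locations.
    country, count = Counter(known).most_common(1)[0]
    share = count / len(known)

    # Accept the estimate only if the share reaches the threshold.
    return country if share >= threshold else None


# Example: most friends leave no usable location, as the abstract describes.
friends = ["Egypt", None, "Egypt", None, None, "Jordan", None, "Egypt", None, None]
print(estimate_nationality(friends))
```

In this example only four of the ten friends have a usable location, yet three of those four agree, so the sketch returns a country. If the leading share is below the threshold, it returns `None` instead of guessing.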
