Comparison of Feature Extraction Methods and Predictors for Income Inference

Patterns of mobile phone communication, combined with information from the social network graph and financial behavior, allow us to infer users' socio-economic attributes, such as income level. We present several methods for extracting features from mobile phone usage (calls and messages) and compare combinations of supervised machine learning techniques and feature sets used as input for inferring users' income. Our experimental results show that the Bayesian method based on the communication graph outperforms standard machine learning algorithms that use node-based features.
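As an illustration of what node-based features extracted from calls and messages might look like, the following minimal sketch aggregates per-user counts from a small log. The record layout (caller, callee, kind, duration), the feature names, and the function node_features are assumptions made for this example only; they are not the feature set or the code used in the paper.

```python
from collections import defaultdict

# Hypothetical record layout (an assumption, not the paper's schema):
# (source user, destination user, "call" or "sms", duration in seconds)
records = [
    ("a", "b", "call", 60),
    ("a", "c", "sms", 0),
    ("b", "a", "call", 120),
    ("c", "a", "call", 30),
]


def node_features(records):
    """Aggregate simple per-user usage features from communication records."""
    feats = defaultdict(lambda: {
        "out_calls": 0, "in_calls": 0,
        "out_sms": 0, "in_sms": 0,
        "call_seconds": 0, "contacts": set(),
    })
    for src, dst, kind, duration in records:
        key = "calls" if kind == "call" else "sms"
        feats[src]["out_" + key] += 1
        feats[dst]["in_" + key] += 1
        if kind == "call":
            # Total talk time is credited to both endpoints of the call.
            feats[src]["call_seconds"] += duration
            feats[dst]["call_seconds"] += duration
        feats[src]["contacts"].add(dst)
        feats[dst]["contacts"].add(src)
    # Replace the contact set with its size (the user's degree in the graph).
    return {
        user: {**{k: v for k, v in f.items() if k != "contacts"},
               "degree": len(f["contacts"])}
        for user, f in feats.items()
    }


features = node_features(records)
```

Each user's feature dictionary could then be turned into an input row for a supervised learner. A graph-based approach would instead also draw on the links between users, which this sketch only summarizes through the degree count.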
