Predicting individual socioeconomic status from mobile phone data: a semi-supervised hypergraph-based factor graph approach

Socioeconomic status (SES) is an important economic and social measure of wide concern. Assessing individual SES can help organizations make a variety of policy decisions. Traditional approaches suffer from the extremely high cost of collecting large-scale SES-related survey data. With the ubiquity of smartphones, mobile phone data has become a novel, low-cost data source for predicting individual SES. However, predicting individual SES from mobile phone data also poses new challenges, including sparse individual records, scarce explicit relationships, and limited labeled samples, none of which were addressed in prior work restricted to regional or household-level SES prediction. To address these issues, we propose a semi-supervised hypergraph-based factor graph model (HyperFGM) for individual SES prediction. To handle the sparsity of individual records, HyperFGM efficiently captures the associations between SES and individual mobile phone records. To compensate for the scarce explicit relationships, HyperFGM models implicit high-order relationships among users on a hypergraph structure. In addition, HyperFGM uses both the limited labeled data and the unlabeled data in a semi-supervised way. Experimental results on a set of anonymized real mobile phone data show that HyperFGM greatly outperforms the baseline methods for individual SES prediction.
