How Are You Related? Predicting the Type of a Social Relationship Using Call Graph Data

Social relationships defined by phone calls between people can be grouped into relationship types, such as family members or co-workers. We propose and evaluate a method that predicts the "relationship type" between a pair of mobile phone subscribers using features that abstract their communication behavior and social network patterns. Our dataset consists of call detail records from a major wireless carrier, sampled from four demographically diverse regions, from which we built a directed social graph with over 200,000 vertices and 400,000 edges. Using account and subscription plan information, we labeled each edge in the graph as one of four relationships: family, co-worker, customer, or service. Our analysis of the dataset shows that these four relationship types exhibit distinct communication behavior patterns and generate characteristic topological features in the social network surrounding each pair. For instance, subscriber pairs with a family relationship generate a high average number of calls, have low call durations, call more frequently, and share more mutual contacts than pairs with a service or co-worker relationship. Using a set of features that abstract these characteristics and the Random Forest supervised machine learning classifier, we demonstrate that it is possible to predict the relationship type between a subscriber pair with an accuracy of 87%.
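
As an illustration of the kind of per-edge features described above, the following minimal sketch derives call counts, mean call duration, and mutual-contact counts from a list of call records. The record layout, the toy data, and the names `CALLS`, `build_graph`, and `edge_features` are assumptions made for this example; the abstract does not specify the exact feature definitions used.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical call detail records: (caller, callee, duration in seconds).
CALLS = [
    ("a", "b", 30),
    ("a", "b", 50),
    ("b", "a", 40),
    ("a", "c", 600),
    ("b", "c", 20),
    ("d", "a", 15),
]


def build_graph(calls):
    """Group durations by directed edge and collect each vertex's contacts."""
    durations = defaultdict(list)  # (caller, callee) -> list of durations
    contacts = defaultdict(set)    # vertex -> neighbours in either direction
    for caller, callee, seconds in calls:
        durations[(caller, callee)].append(seconds)
        contacts[caller].add(callee)
        contacts[callee].add(caller)
    return durations, contacts


def edge_features(calls):
    """Return one feature dictionary per directed edge (assumed feature set)."""
    durations, contacts = build_graph(calls)
    features = {}
    for (u, v), secs in durations.items():
        # Contacts shared by both endpoints, excluding the endpoints themselves.
        mutual = (contacts[u] & contacts[v]) - {u, v}
        features[(u, v)] = {
            "n_calls": len(secs),
            "mean_duration": mean(secs),
            "mutual_contacts": len(mutual),
            "reciprocated": (v, u) in durations,
        }
    return features


if __name__ == "__main__":
    for edge, row in sorted(edge_features(CALLS).items()):
        print(edge, row)
```

Feature rows of this shape, paired with the edge labels (family, co-worker, customer, service), could then be passed to a Random Forest implementation for supervised training; that step is omitted here to keep the sketch dependency-free.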
