Learning to Discover Social Circles in Ego Networks

Our personal social networks are big and cluttered, and currently there is no good way to organize them. Social networking sites allow users to manually categorize their friends into social circles (e.g. ‘circles’ on Google+, and ‘lists’ on Facebook and Twitter); however, these circles are laborious to construct and must be updated whenever a user’s network grows. We define a novel machine learning task of identifying users’ social circles. We pose the problem as a node clustering problem on a user’s ego-network, a network of connections between her friends. We develop a model for detecting circles that combines network structure with user profile information. For each circle we learn its members and a circle-specific user profile similarity metric. Modeling node membership in multiple circles allows us to detect overlapping as well as hierarchically nested circles. Experiments show that our model accurately identifies circles on a diverse set of data from Facebook, Google+, and Twitter, for all of which we obtain hand-labeled ground truth.
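To make the two ingredients named above concrete, the following is a minimal sketch and not the model itself. It assumes an undirected edge list and profiles given as sets of feature strings. The function names `ego_network` and `circle_similarity`, the weight dictionary `theta`, and the toy data are all hypothetical and chosen only for illustration. It shows (1) extracting an ego-network as the connections among a user's friends, and (2) scoring profile similarity with weights that differ per circle.

```python
def ego_network(edges, ego):
    """Return (friends, friend_edges) for `ego`.

    friends: nodes directly connected to `ego`.
    friend_edges: undirected edges whose endpoints are both friends;
    the ego node itself is excluded, matching "a network of
    connections between her friends".
    """
    friends = set()
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        if u == ego:
            friends.add(v)
        elif v == ego:
            friends.add(u)
    friend_edges = {
        frozenset((u, v))
        for u, v in edges
        if u != v and u in friends and v in friends
    }
    return friends, friend_edges


def circle_similarity(profile_x, profile_y, theta):
    """Weighted count of shared profile features.

    `theta` maps a feature to its weight for one circle (an assumed,
    simplified stand-in for a circle-specific similarity metric).
    Features missing from `theta` contribute nothing.
    """
    shared = profile_x & profile_y
    return sum(theta.get(feature, 0.0) for feature in shared)


# Toy data (invented for illustration).
edges = [("ego", "a"), ("ego", "b"), ("ego", "c"), ("a", "b"), ("c", "d")]
profiles = {
    "a": {"school:X", "job:Y"},
    "b": {"school:X", "job:Z"},
}
theta_school = {"school:X": 1.0}            # a "school" circle cares about schools
theta_work = {"job:Y": 1.0, "job:Z": 1.0}   # a "work" circle cares about jobs

friends, friend_edges = ego_network(edges, "ego")
sim_school = circle_similarity(profiles["a"], profiles["b"], theta_school)
sim_work = circle_similarity(profiles["a"], profiles["b"], theta_work)
```

In this toy example, `a` and `b` look similar under the school weights but not under the work weights, which is the intuition behind giving each circle its own similarity metric. Allowing a node to appear in several circles at once is what permits overlapping and nested circles.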
