User profiling in an ego network: co-profiling attributes and relationships

User attributes, such as occupation, education, and location, are important for many applications. In this paper, we study the problem of profiling user attributes in social network. To capture the correlation between attributes and social connections, we present a new insight that social connections are discriminatively correlated with attributes via a hidden factor -- relationship type. For example, a user's colleagues are more likely to share the same employer with him than other friends. Based on the insight, we propose to co-profile users' attributes and relationship types of their connections. To achieve co-profiling, we develop an efficient algorithm based on an optimization framework. Our algorithm captures our insight effectively. It iteratively profiles attributes by propagation via certain types of connections, and profiles types of connections based on attributes and the network structure. We conduct extensive experiments to evaluate our algorithm. The results show that our algorithm profiles various attributes accurately, which improves the state-of-the-art methods by 12%.

[1]  Foster Provost,et al.  A Simple Relational Classifier , 2003 .

[2]  Stanford,et al.  Learning to Discover Social Circles in Ego Networks , 2012 .

[3]  Hong Cheng,et al.  Graph Clustering Based on Structural/Attribute Similarities , 2009, Proc. VLDB Endow..

[4]  John D. Burger,et al.  Discriminating Gender on Twitter , 2011, EMNLP.

[5]  Jure Leskovec,et al.  Predicting positive and negative links in online social networks , 2010, WWW '10.

[6]  Lars Backstrom,et al.  Find me if you can: improving geographical prediction with social and spatial proximity , 2010, WWW '10.

[7]  John D. Burger,et al.  An Exploration of Observable Features Related to Blogger Age , 2006, AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs.

[8]  Michael I. Jordan,et al.  On Spectral Clustering: Analysis and an algorithm , 2001, NIPS.

[9]  Ana-Maria Popescu,et al.  Democrats, republicans and starbucks afficionados: user classification in twitter , 2011, KDD.

[10]  Zoubin Ghahramani,et al.  Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions , 2003, ICML 2003.

[11]  Jon Oberlander,et al.  The Identity of Bloggers: Openness and Gender in Personal Weblogs , 2006, AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs.

[12]  Bernhard Schölkopf,et al.  Learning with Local and Global Consistency , 2003, NIPS.

[13]  Jiawei Han,et al.  Mining advisor-advisee relationships from research publication networks , 2010, KDD.

[14]  Ravi Kumar,et al.  "I know what you did last summer": query logs and user privacy , 2007, CIKM '07.

[15]  Micah Adler,et al.  Clustering Relational Data Using Attribute and Link Information , 2003 .

[16]  Krishna P. Gummadi,et al.  You are who you know: inferring user profiles in online social networks , 2010, WSDM '10.

[17]  Jie Tang,et al.  Learning to Infer Social Ties in Large Networks , 2011, ECML/PKDD.

[18]  Rui Wang,et al.  Towards social user profiling: unified and discriminative influence model for inferring home locations , 2012, KDD.

[19]  M. Newman,et al.  Finding community structure in very large networks. , 2004, Physical review. E, Statistical, nonlinear, and soft matter physics.

[20]  Kyumin Lee,et al.  You are where you tweet: a content-based approach to geo-locating twitter users , 2010, CIKM.

[21]  Ling Huang,et al.  Evolution of social-attribute networks: measurements, modeling, and implications using google+ , 2012, Internet Measurement Conference.

[22]  Jimeng Sun,et al.  Confluence: conformity influence in large social networks , 2013, KDD.

[23]  David Yarowsky,et al.  Classifying latent user attributes in twitter , 2010, SMUC '10.

[24]  M E J Newman,et al.  Fast algorithm for detecting community structure in networks. , 2003, Physical review. E, Statistical, nonlinear, and soft matter physics.

[25]  Lise Getoor,et al.  Link-Based Classification , 2003, Encyclopedia of Machine Learning and Data Mining.