Yes, there is a correlation: from social networks to personal behavior on the web

Characterizing the relationship between a person's social group and his or her personal behavior has been a long-standing goal of social network analysts. In this paper, we apply data mining techniques to study this relationship for a population of over 10 million people, by turning to online sources of data. The analysis reveals that people who chat with each other (using instant messaging) are more likely to share interests (their Web searches are the same or topically similar). The more time they spend talking, the stronger this relationship is. People who chat with each other are also more likely to share other personal characteristics, such as their age and location (and they are likely to be of the opposite gender). Similar findings hold for people who do not necessarily talk to each other but do have a friend in common. Our analysis is based on a well-defined mathematical formulation of the problem, and is the largest such study we are aware of.
