Googling Social Interactions: Web Search Engine Based Social Network Construction

Social network analysis is a long-standing topic in sociology. Until the era of information technology, however, data were collected mainly through personal surveys, which severely limited their availability and prevented large-scale analysis. Recently, the explosion of automatically generated data has changed how research is done. For instance, the enormous amount of data from so-called high-throughput biological experiments has introduced a systems-level, or network, viewpoint to traditional biology. Is “high-throughput” sociological data generation possible, then? Google, which has become one of the most influential symbols of the new Internet paradigm within the last ten years, might provide torrents of data for such studies in the current and forthcoming digital era. We investigate social networks between people by extracting information from the Web, and we introduce new tools for analyzing such networks in the context of the statistical physics of complex systems, or socio-physics. As a concrete and illustrative example, we analyze the members of the 109th United States Senate and demonstrate that the methods of construction and analysis are applicable to various other weighted networks.
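The abstract does not spell out how the network is built, so the following is only a minimal sketch of one plausible construction. It assumes that the weight of a tie between two people is the number of Web pages on which both names appear. The helper names (`build_weighted_network`, `strength`, `toy_hit_count`) and the toy counts are hypothetical; no real search-engine API is called, and the hit counter is passed in as a plain function.

```python
from itertools import combinations
from typing import Callable, Dict, Iterable, Tuple

Edge = Tuple[str, str]


def build_weighted_network(
    names: Iterable[str],
    hit_count: Callable[[str, str], int],
) -> Dict[Edge, int]:
    """Return {(a, b): weight} for every unordered pair with a nonzero count.

    `hit_count` is an assumed stand-in for whatever returns the number of
    pages mentioning both names; it is not a real search-engine call.
    """
    weights: Dict[Edge, int] = {}
    for a, b in combinations(sorted(names), 2):
        w = hit_count(a, b)
        if w > 0:
            weights[(a, b)] = w
    return weights


def strength(weights: Dict[Edge, int], node: str) -> int:
    """Sum of the weights of all edges attached to `node`."""
    return sum(w for (a, b), w in weights.items() if node in (a, b))


# Toy, made-up co-occurrence counts for three placeholder names.
TOY_COUNTS = {
    frozenset(("A", "B")): 120,
    frozenset(("A", "C")): 30,
    frozenset(("B", "C")): 0,
}


def toy_hit_count(a: str, b: str) -> int:
    """Look up the made-up count for an unordered pair of names."""
    return TOY_COUNTS.get(frozenset((a, b)), 0)


network = build_weighted_network(["A", "B", "C"], toy_hit_count)
```

With the toy counts, `network` holds two weighted edges, and `strength(network, "A")` adds up the weights of both ties attached to `A`. Passing the counter in as a function keeps the construction independent of any particular data source.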
