Thought leaders during crises in massive social networks

Making vast amounts of online social media data comprehensible to an analyst is a key question in operational analytics. Twitter and micro-blog conversations can easily be gathered from Internet services such as Spinn3r to create graphs, containing billions of vertices and tens of billions of edges, that represent the interactions between the entities in an online community. Graphs of this size can easily be represented on a modern laptop or workstation. The challenge lies in making them comprehensible. This paper focuses on methods to assemble social network graphs from online social media to reveal nodes that are ‘interesting’ in the context of operational analysis, meaning that the computational results can be interpreted by a human analyst wishing to answer some operational questions. Only metrics based on the structure of the graph are used, which avoids the challenges and costs involved in message content analysis. We further restrict ourselves to metrics that are computationally tractable on billion-node graphs. The reported results demonstrate that nodes with a high impact or disproportionately large agency on the whole network (e.g., an online community) can be found in a variety of online communities. Validation of the importance of these high-agency nodes by human and computational methods is discussed, and the efficacy of our approach is reported using both quantitative methods and tests against the null hypothesis. © 2012 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 5: 205–217, 2012
