Web Science 2.0: Identifying Trends through Semantic Social Network Analysis

We introduce a novel set of algorithms based on social network analysis for mining the Web, blogs, and online forums to identify trends and find the people who launch them. These algorithms are implemented in Condor, a software system for predictive search and analysis of the Web, and of social networks in particular. They include the temporal computation of network centrality measures, the visualization of social networks as Cybermaps, a semantic process for mining and analyzing large amounts of text based on social network analysis, and sentiment analysis and information filtering methods. Computing the betweenness of concepts over time makes it possible to extract and predict long-term trends in the popularity of relevant concepts such as brands, movies, and politicians. We illustrate our approach by qualitatively comparing Web buzz and our Web betweenness for the 2008 US presidential elections, and by correlating the Web buzz index with share prices.
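As a rough illustration of what a temporal betweenness computation over concepts can look like, the following minimal sketch groups timestamped concept links into fixed windows and computes unnormalized betweenness centrality per window with Brandes' algorithm. It is not Condor's implementation, which is not described in this abstract. The function names, the integer day timestamps, the window size, and the sample links are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque


def betweenness(edges):
    """Unnormalized betweenness for an unweighted, undirected graph (Brandes)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    score = {node: 0.0 for node in adj}
    for source in adj:
        # Breadth-first search counting shortest paths from `source`.
        stack = []
        preds = {node: [] for node in adj}
        sigma = dict.fromkeys(adj, 0)   # number of shortest paths
        dist = dict.fromkeys(adj, -1)   # -1 marks "not yet visited"
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies in order of decreasing distance.
        delta = dict.fromkeys(adj, 0.0)
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {node: s / 2 for node, s in score.items()}


def temporal_betweenness(timed_edges, window):
    """Map each time window index to the betweenness scores of its graph."""
    buckets = defaultdict(list)
    for t, u, v in timed_edges:
        buckets[t // window].append((u, v))
    return {k: betweenness(buckets[k]) for k in sorted(buckets)}


# Hypothetical sample: (day, concept_a, concept_b) links found in Web text.
timed_edges = [
    (0, "obama", "economy"),
    (0, "economy", "mccain"),
    (1, "obama", "economy"),
    (1, "obama", "iraq"),
    (1, "obama", "mccain"),
]
trend = temporal_betweenness(timed_edges, window=1)
print(trend[0]["obama"], trend[1]["obama"])  # 0.0 3.0
```

In this toy data the concept "obama" moves from the edge of the day-0 graph to the hub of the day-1 graph, so its betweenness rises from 0.0 to 3.0. Tracking such per-window scores over a longer period is one simple way to read a trend line from a concept network.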
