The Atlas of Chinese World Wide Web Ecosystem Shaped by the Collective Attention Flows

The web can be regarded as an ecosystem of digital resources connected and shaped by the collective, successive behaviors of users. Knowing how people allocate their limited attention across different resources is therefore of great importance. To address this question, we embed the most popular Chinese web sites into a high-dimensional Euclidean space based on an open flow network model of the collective attention flows of a large number of Chinese users, which accounts for both the topology of hyperlinks between the sites and the collective behaviors of the users. With these tools, we rank the web sites and compare their centralities based on flow distances with other metrics. We also study the patterns of attention flow allocation, and find that a large number of web sites concentrate in the central area of the embedding space, while only a small fraction of web sites are dispersed in the periphery. The entire embedding space can be separated into three regions (core, interim, and periphery). The sites in the core (1% of sites) attract a large share of the attention flows (40%), the sites in the interim (34%) attract another 40%, and the remaining sites (65%) receive only 20% of the flows. In addition, we cluster the web sites into four groups according to their positions in the space, and find that web sites with similar content and topics are grouped together. In short, by incorporating the open flow network model, we can clearly see how collective attention is allocated and flows across different web sites, and how web sites are connected to each other.
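
To make the reported allocation concrete, the following minimal sketch computes how concentrated attention is in each region, using only the percentages quoted above. The names `REGIONS`, `concentration`, and `region_concentrations` are illustrative assumptions, and the ratio (flow share divided by site share) is a simple derived quantity, not a metric defined in the paper.

```python
# Shares quoted in the abstract: (fraction of sites, fraction of attention flow).
# The dictionary layout and names are assumptions made for illustration.
REGIONS = {
    "core": (0.01, 0.40),
    "interim": (0.34, 0.40),
    "periphery": (0.65, 0.20),
}


def concentration(site_share, flow_share):
    """Flow share divided by site share; 1.0 would mean a proportional share."""
    return flow_share / site_share


def region_concentrations(regions=REGIONS):
    """Return the concentration ratio for every region."""
    return {name: concentration(s, f) for name, (s, f) in regions.items()}


if __name__ == "__main__":
    for name, ratio in region_concentrations().items():
        print(f"{name}: {ratio:.2f}x its proportional share")
```

Under these figures, a core site attracts on average roughly 40 times its proportional share of attention, an interim site slightly more than its proportional share, and a periphery site well under one third of it.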
