Networks describe a variety of complex natural systems, including social systems. We investigate the social network of co-occurrence in the Reuters-21578 corpus, which consists of news articles that appeared on the Reuters newswire in 1987. People are represented as vertices, and two persons are connected if they co-occur in the same article. The network has small-world features and a power-law degree distribution. The network is disconnected, and its component size distribution also has power-law characteristics. Community detection on a degree-reduced network yields meaningful communities. An edge-reduced network, which contains only the strong ties, has a star topology. We also investigate the "importance" of persons. The network captures the situation in 1987; twenty years later, the importance of these people can be judged better. A number of ranking algorithms, including Citation count and PageRank, are used to assign ranks to vertices. The ranks given by the algorithms are compared against how well a person is represented in Wikipedia. We find Spearman's rank correlations that are at most moderate. A noteworthy finding is that PageRank consistently performs worse than the other algorithms. We analyze this further and identify reasons for it.
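The pipeline described above (co-occurrence graph, vertex ranking, comparison against an external score) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in the study: the article lists, the person labels `A` to `E`, and the `reference_score` values are invented placeholders, vertex degree stands in as a simple baseline ranking, and the damping factor of 0.85 is a conventional choice that the text does not specify.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence(articles):
    """Return {person: {neighbor: weight}}; weight counts shared articles."""
    graph = defaultdict(lambda: defaultdict(int))
    for people in articles:
        unique = sorted(set(people))
        for p in unique:
            graph[p]  # touch the key so persons with no co-occurrence are vertices too
        for a, b in combinations(unique, 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def pagerank(graph, damping=0.85, iterations=100):
    """Power-iteration PageRank on a weighted undirected graph."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by vertices without neighbors is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not graph[v])
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            total = sum(graph[v].values())
            for u, w in graph[v].items():
                new[u] += damping * rank[v] * w / total
        rank = new
    return rank


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Toy input (hypothetical): each inner list holds the persons named in one article.
articles = [["A", "B"], ["A", "C"], ["A", "B", "D"], ["E"]]
graph = build_cooccurrence(articles)
ranks = pagerank(graph)
degree = {v: len(graph[v]) for v in graph}

# Hypothetical external score standing in for "how well represented in Wikipedia".
reference_score = {"A": 50, "B": 5, "C": 20, "D": 1, "E": 10}
people = sorted(graph)
print("degree vs reference:", spearman([degree[p] for p in people],
                                       [reference_score[p] for p in people]))
print("pagerank vs reference:", spearman([ranks[p] for p in people],
                                         [reference_score[p] for p in people]))
```

Edge weights here count the number of shared articles, so thresholding on that weight would be one way to obtain an edge-reduced network of strong ties; the threshold itself would be a further assumption.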