Bodies of information available through the Internet, such as digital libraries and distributed file-sharing systems, often form a self-organizing networked information space, i.e. a collection of interconnected information entities generated incrementally over time by a large number of agents. The collection of electronically available research papers in Computer Science, linked by their citations, is a good example of such a space. In this work we present a study of the structure of the citation graph of computer science literature. Using a web robot, we build several citation graphs from parts of the digital library ResearchIndex. After verifying that their degree distributions follow a power law, we apply a series of graph-theoretical algorithms to elicit an aggregate picture of the citation graph in terms of its connectivity. The results expand our insight into the structure of self-organizing networked information spaces and may inform the design of focused crawlers that search such a space for topic-specific information.
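As a rough illustration of the two kinds of analysis mentioned above (checking the degree distribution and summarizing connectivity), the following minimal sketch works on a toy edge list. It is not the authors' pipeline: the paper identifiers, the function names, and the use of a least-squares fit on log-log axes are all assumptions made for illustration, and a log-log fit is only an informal check for power-law behaviour, not a rigorous test.

```python
from collections import Counter, defaultdict, deque
import math


def in_degree_histogram(edges):
    """Map in-degree k to the number of papers cited exactly k times (k >= 1)."""
    indeg = Counter(cited for _citing, cited in edges)
    return Counter(indeg.values())


def loglog_slope(hist):
    """Least-squares slope of log(count) against log(degree).

    A roughly straight line with negative slope on log-log axes is
    consistent with (but does not prove) a power-law distribution.
    """
    pts = [(math.log(k), math.log(c)) for k, c in hist.items() if k > 0 and c > 0]
    if len({x for x, _ in pts}) < 2:
        raise ValueError("need at least two distinct degrees to fit a slope")
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return sxy / sxx


def weak_components(edges):
    """Weakly connected components (edge direction ignored), largest first."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, comp = deque([start]), set()
        while queue:  # breadth-first search from an unvisited paper
            node = queue.popleft()
            comp.add(node)
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        comps.append(comp)
    return sorted(comps, key=len, reverse=True)


# Hypothetical toy data: each pair is (citing paper, cited paper).
edges = [
    ("p2", "p1"), ("p3", "p1"), ("p4", "p1"), ("p5", "p1"),
    ("p3", "p2"), ("p4", "p2"), ("p5", "p3"), ("p7", "p6"),
]

hist = in_degree_histogram(edges)
slope = loglog_slope(hist)
comps = weak_components(edges)
print("in-degree histogram:", dict(hist))
print("log-log slope:", slope)
print("component sizes:", [len(c) for c in comps])
```

On a real crawl the edge list would come from the extracted citations, and the size of the largest weakly connected component relative to the whole graph gives a first aggregate measure of connectivity.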