Characterizing the Citation Graph as a Self-Organizing Networked Information Space

Bodies of information available through the Internet, such as digital libraries and distributed file-sharing systems, often form a self-organizing networked information space, i.e. a collection of interconnected information entities generated incrementally over time by a large number of agents. The collection of electronically available research papers in Computer Science, linked by their citations, form a good example of such a space. In this work we present a study of the structure of the citation graph of computer science literature. Using a web robot we build several citation graphs from parts of the digital library ResearchIndex. After verifying that the degree distributions follow a power law, we apply a series of graph theoretical algorithms to elicit an aggregate picture of the citation graph in terms of its connectivity. The results expand our insight into the structure of self-organizing networked information spaces, and may inform the design of focused crawlers searching such a space for topic-specific information.