Characterizing the Citation Graph as a Self-Organizing Networked Information Space

Bodies of information available through the Internet, such as digital libraries and distributed file-sharing systems, often form a self-organizing networked information space, i.e., a collection of interconnected information entities generated incrementally over time by a large number of agents. The collection of electronically available research papers in Computer Science, linked by their citations, forms a good example of such a space. In this work we present a study of the structure of the citation graph of computer science literature. Using a web robot, we build several citation graphs from parts of the digital library ResearchIndex. After verifying that the degree distributions follow a power law, we apply a series of graph-theoretic algorithms to elicit an aggregate picture of the citation graph in terms of its connectivity. The results expand our insight into the structure of self-organizing networked information spaces and may inform the design of focused crawlers searching such a space for topic-specific information.

Yuan An | Evangelos E. Milios | Jeannette C. M. Janssen
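
The two analysis steps named in the abstract, checking the degree distribution and examining connectivity, can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy edge list, and the choice of a least-squares fit on log-log axes (only a rough check for power-law behaviour) are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def in_degree_histogram(edges):
    """Map in-degree k to the number of papers cited exactly k times (k >= 1)."""
    in_degree = Counter(cited for _citing, cited in edges)
    return Counter(in_degree.values())


def loglog_slope(histogram):
    """Least-squares slope of log(count) against log(degree).

    A roughly straight line with negative slope on log-log axes is
    consistent with a power law; this is a crude check, not a rigorous test.
    """
    xs = [math.log(k) for k in histogram]
    ys = [math.log(c) for c in histogram.values()]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def weakly_connected_components(edges):
    """Components of the citation graph with edge direction ignored."""
    neighbours = defaultdict(set)
    for citing, cited in edges:
        neighbours[citing].add(cited)
        neighbours[cited].add(citing)
    seen, components = set(), []
    for start in neighbours:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:  # iterative depth-first traversal
            node = stack.pop()
            component.add(node)
            for other in neighbours[node]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        components.append(component)
    return components


# Hypothetical toy data: each pair is (citing paper, cited paper).
edges = [("p2", "p1"), ("p3", "p1"), ("p4", "p1"), ("p4", "p2"), ("p6", "p5")]
histogram = in_degree_histogram(edges)
slope = loglog_slope(histogram)
component_sizes = sorted(len(c) for c in weakly_connected_components(edges))
```

On a real crawl the edge list would hold many thousands of citation pairs, and the tail of the histogram would be noisy, so a binned or maximum-likelihood estimate would likely be more reliable than this simple fit.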
