Paper information - Increasing scalability of researcher network extraction from the web

Social networks, which describe relations among people or organizations as a network, have recently attracted attention. With the help of a social network, we can analyze the structure of a community and thereby promote efficient communication within it. We investigate the problem of extracting a network of researchers from the Web in order to support efficient cooperation among researchers. Our method uses a search engine to obtain the co-occurrences of the names of two researchers and calculates the strength of the relation between them. We then label the relation by analyzing the Web pages in which the two names co-occur. Research on social network extraction using search engines, such as ours, is attracting attention in Japan as well as abroad. However, previous approaches issue too many queries to search engines to extract a large-scale network. In this paper, we propose a method that filters out superfluous queries and facilitates the extraction of large-scale networks. With this method we are able to extract a network of around 3000 nodes. Our experimental results show that the proposed method significantly reduces the number of queries while preserving the quality of the network, compared with previous methods.
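The co-occurrence step described above can be illustrated with a minimal sketch. The abstract does not name the strength measure, so the Jaccard coefficient used here is an assumption, as are the function names, the threshold, and the hit counts, which are made-up placeholder values standing in for real search-engine results.

```python
from itertools import combinations

# Hypothetical hit counts standing in for search-engine results.
# SINGLE_HITS[name]  : pages that mention the name (illustrative values)
# PAIR_HITS[{x, y}]  : pages that mention both names (illustrative values)
SINGLE_HITS = {"A": 1000, "B": 500, "C": 200}
PAIR_HITS = {frozenset(("A", "B")): 300, frozenset(("A", "C")): 10}


def jaccard_strength(hits_x, hits_y, hits_xy):
    """Relation strength as a Jaccard coefficient over hit counts.

    Assumed measure: the source only says that a strength is calculated
    from co-occurrences, not which formula is used.
    """
    union = hits_x + hits_y - hits_xy
    return hits_xy / union if union > 0 else 0.0


def extract_edges(names, threshold):
    """Score every pair of names and keep pairs at or above the threshold."""
    edges = {}
    for x, y in combinations(sorted(names), 2):
        # Pairs with no recorded co-occurrence count as zero hits.
        hits_xy = PAIR_HITS.get(frozenset((x, y)), 0)
        strength = jaccard_strength(SINGLE_HITS[x], SINGLE_HITS[y], hits_xy)
        if strength >= threshold:
            edges[(x, y)] = strength
    return edges
```

This naive loop scores all n(n-1)/2 pairs, which for 3000 names is roughly 4.5 million pair queries against a real search engine. That cost is the scalability problem the abstract refers to; the paper's own filtering method is not reproduced here.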
