A Dynamically Reconfigurable Model for a Distributed Web Crawling System
A distributed web crawling system must coordinate the whole system whenever its set of nodes changes. This paper presents an efficient dynamic reconfigurability model for such a system. By analyzing the model, we derive methods for optimizing the performance of the distributed web crawling system, i.e., maintaining load balance while generating little network traffic. The model is currently being introduced to improve WebGather, a well-known Chinese and English web search engine. We also believe that the model can be useful in other web crawling systems that adopt a distributed architecture.
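To make the two stated goals concrete, the following is a minimal sketch of one standard way to reassign crawl targets when nodes join or leave: consistent hashing. The abstract does not specify the mechanics of the proposed model, so this is an illustration of the problem setting under our own assumptions, not the paper's method. The names `HashRing`, `node_for`, the `replicas` parameter, and the node labels are all hypothetical.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring (MD5 used only for spreading)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Illustrative consistent-hash ring; NOT the model proposed in the paper."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual points per node; more points, smoother balance
        self._points = []         # sorted ring positions
        self._owner = {}          # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        """Place `replicas` virtual points for a crawler node on the ring."""
        for i in range(self.replicas):
            point = _hash(f"{node}#{i}")
            if point in self._owner:  # extremely unlikely collision: skip it
                continue
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove_node(self, node: str) -> None:
        """Drop every virtual point that belongs to a departing node."""
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def node_for(self, host: str) -> str:
        """Return the node responsible for crawling `host`."""
        if not self._points:
            raise LookupError("ring has no nodes")
        # First ring point clockwise from the host's position, wrapping around.
        i = bisect.bisect_right(self._points, _hash(host)) % len(self._points)
        return self._owner[self._points[i]]
```

With this scheme, adding a node moves only the hosts that the new node takes over, and removing it returns exactly those hosts to their previous owners. Every other assignment is untouched, which bounds the data that must be transferred during reconfiguration, while the virtual points keep the per-node share of hosts roughly even.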