Paper Information - Web Crawling Algorithms

Web Crawling Algorithms

As the Internet grows rapidly, it has become important to make the search for content faster and more accurate. Without efficient search engines, it would be impossible to get accurate results. To address this problem, search engines rely on software called a “Web Crawler”, which uses various algorithms to achieve this goal. These algorithms use heuristic functions to increase the efficiency of the crawlers. A* and Adaptive A* Search are among the best path-finding algorithms. A* uses best-first search to find the least-cost path from a given initial node to a goal node. In this work, we study some of the existing Web Crawler algorithms and modify the A* and Adaptive A* methods for use in this domain. Being heuristic approaches, A* and Adaptive A* can be used to find desired results in web-like weighted environments. We create a virtual web environment using graphs and compare, across various web crawling algorithms, the time taken to reach the desired node from any random node.

Keywords: Web Crawler; A*; Adaptive A*
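To make the idea of A* as a best-first, least-cost path search concrete, the following is a minimal sketch of textbook A* on a small weighted graph standing in for a virtual web environment. It is not the modified A*/Adaptive A* method studied in the paper. The page names, link weights, heuristic estimates, and the names `a_star`, `toy_web`, and `estimate` are all hypothetical and chosen only for illustration.

```python
import heapq


def a_star(graph, start, goal, heuristic):
    """Return (cost, path) for a least-cost path from start to goal, or None.

    graph maps each node to a dict of {neighbor: edge_weight}.
    heuristic(node) estimates the remaining cost to the goal and is
    assumed never to overestimate it (i.e. it is admissible).
    """
    # Frontier entries are (f, g, node, path), where f = g + heuristic(node).
    frontier = [(heuristic(start), 0, start, [start])]
    best_g = {start: 0}  # cheapest known cost from start to each node
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if g > best_g.get(node, float("inf")):
            continue  # stale entry: a cheaper route to this node was found
        for neighbor, weight in graph.get(node, {}).items():
            new_g = g + weight
            if new_g < best_g.get(neighbor, float("inf")):
                best_g[neighbor] = new_g
                heapq.heappush(
                    frontier,
                    (new_g + heuristic(neighbor), new_g, neighbor, path + [neighbor]),
                )
    return None  # goal is not reachable from start


# Hypothetical toy "web": nodes are pages, weighted edges are links.
toy_web = {
    "home": {"news": 2, "blog": 5},
    "news": {"article": 4, "blog": 1},
    "blog": {"article": 1},
    "article": {},
}

# Assumed heuristic estimates of the remaining cost to reach "article".
estimate = {"home": 3, "news": 2, "blog": 1, "article": 0}

result = a_star(toy_web, "home", "article", estimate.get)
print(result)  # (4, ['home', 'news', 'blog', 'article'])
```

In this toy graph the direct link from "news" to "article" costs more than the detour through "blog", so the search returns the three-hop path. Passing a heuristic that always returns 0 would make the same function behave like a uniform-cost (Dijkstra-style) search.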

Aviral Nigam

[1] S. Raja et al. A Survey of Web Crawler Algorithms, 2011.

[2] Sandeep Sharma et al. Web-Crawling Approaches in Search Engines, 2008.

[3] Filippo Menczer et al. Topical web crawlers: Evaluating adaptive algorithms, 2004, TOIT.

[4] Sven Koenig et al. Adaptive A*, 2005, AAMAS '05.