Web Crawling Algorithms
As the Internet grows rapidly, it has become important to make the search for content faster and more accurate. Without efficient search engines, it would be impossible to get accurate results. Search engines address this problem with software called a "Web Crawler", which uses various algorithms to traverse the web. These algorithms rely on heuristic functions to increase crawler efficiency. A* and Adaptive A* search are among the best path-finding algorithms. A* is a best-first search that finds the least-cost path from a given initial node to a goal node. In this work, we study some of the existing web crawler algorithms and modify the A* and Adaptive A* methods for use in this domain. Because they are heuristic approaches, A* and Adaptive A* can be used to find desired results in web-like weighted environments. We create a virtual web environment using graphs and compare, across various web crawling algorithms, the time taken to reach a desired node from a random starting node.

Keywords: Web Crawler; A*; Adaptive A*
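To make the least-cost search concrete, here is a minimal sketch of textbook A* on a small weighted graph standing in for a virtual web environment. It is not the modified A*/Adaptive A* method of this work. The graph `web`, the heuristic table `estimate`, and the function name `a_star` are all hypothetical, and the heuristic is assumed to be consistent so that each node needs to be expanded only once.

```python
import heapq


def a_star(graph, start, goal, h):
    """Return (cost, path) for a least-cost path, or (inf, []) if none exists.

    graph: dict mapping node -> {neighbor: edge_weight}
    h:     function giving an estimated remaining cost to the goal
           (assumed consistent, so a closed set is safe to use)
    """
    # Frontier entries are (f, g, node), with f = g + h(node).
    frontier = [(h(start), 0.0, start)]
    best_g = {start: 0.0}   # cheapest known cost from start to each node
    parent = {start: None}  # back-pointers for path reconstruction
    closed = set()          # nodes already expanded

    while frontier:
        _, g, node = heapq.heappop(frontier)
        if node in closed:
            continue  # stale entry for a node expanded earlier
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return g, path[::-1]
        closed.add(node)
        for nxt, weight in graph.get(node, {}).items():
            new_g = g + weight
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                parent[nxt] = node
                heapq.heappush(frontier, (new_g + h(nxt), new_g, nxt))
    return float("inf"), []


# Hypothetical four-page "web": nodes are pages, weights are link costs.
web = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 2, "D": 5},
    "C": {"D": 1},
    "D": {},
}
# Hypothetical heuristic: estimated remaining cost to page "D".
estimate = {"A": 3, "B": 2, "C": 1, "D": 0}

cost, path = a_star(web, "A", "D", estimate.get)
print(cost, path)
```

On this graph the search prefers the three-link route A, B, C, D (total cost 4) over the shorter-looking A, B, D (cost 6), which is the behavior a weighted crawler relies on when link costs differ.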