Paper Information - A New Approach to Design a Domain Specific Web Search Crawler Using Multilevel Domain Classifier

Publishing information on the Internet has become commonplace, and as a result the volume of available information has grown enormous. To handle this volume, Web researchers have introduced various types of search engines. Efficient Web-page crawling and resource-repository building mechanisms are an important part of a search engine, and researchers have already introduced a variety of Web search crawler mechanisms for different search engines. In this paper, we introduce a new design and development mechanism for a domain-specific Web search crawler that uses multilevel domain classifiers, crawls Web pages related to multiple domains, and uses parallel crawling, among other features. Two domain classifiers are used to identify domain-specific Web pages. They are applied one after the other, i.e., at two levels, which is why we call this crawler a multilevel domain-specific Web search crawler.
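The two-level filtering idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how either classifier works, so everything here is an assumption made for illustration: the keyword-overlap classifiers, the names `keyword_classifier`, `crawl`, and `fetch_page`, the thresholds, and the toy in-memory pages that stand in for real HTTP fetches. Parallel crawling is left out to keep the example short.

```python
from collections import deque
from typing import Callable, Dict, List, Set, Tuple

Classifier = Callable[[str], bool]
Fetcher = Callable[[str], Tuple[str, List[str]]]


def keyword_classifier(keywords: Set[str], min_hits: int) -> Classifier:
    """Hypothetical classifier: accept text containing at least
    min_hits distinct keywords (not the paper's actual classifier)."""
    def classify(text: str) -> bool:
        words = set(text.lower().split())
        return len(words & keywords) >= min_hits
    return classify


def crawl(seeds: List[str], fetch: Fetcher, level1: Classifier,
          level2: Classifier, max_pages: int = 100) -> List[str]:
    """Breadth-first crawl that stores a page only if both classifiers
    accept it, applied one after the other."""
    frontier = deque(seeds)
    seen = set(seeds)
    repository: List[str] = []
    fetched = 0
    while frontier and fetched < max_pages:
        url = frontier.popleft()
        text, links = fetch(url)
        fetched += 1
        # Level 1 runs first; level 2 only sees pages level 1 accepted.
        if not (level1(text) and level2(text)):
            continue  # off-domain page: neither stored nor expanded
        repository.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return repository


# Toy pages (URL -> (text, outgoing links)); purely illustrative data.
PAGES: Dict[str, Tuple[str, List[str]]] = {
    "a": ("cricket match score batsman", ["b", "c"]),
    "b": ("cricket recipe for dinner", ["d"]),
    "c": ("batsman score in the cricket final", ["a"]),
    "d": ("cricket score batsman", []),
}


def fetch_page(url: str) -> Tuple[str, List[str]]:
    return PAGES[url]


level1 = keyword_classifier({"cricket", "football"}, 1)
level2 = keyword_classifier({"score", "batsman", "wicket"}, 2)
result = crawl(["a"], fetch_page, level1, level2)
print(result)
```

In this sketch, page `b` passes the first level but fails the second, so its links are never followed and page `d` is not reached. Whether a real crawler should discard the outgoing links of rejected pages is a design choice that the abstract does not settle.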

Debajyoti Mukhopadhyay | Rana Dattagupta | Sukanta Sinha
