Exploiting the Deep Web with DynaBot: Matching, Probing, and Ranking

We present the design of DynaBot, a guided Deep Web discovery system. DynaBot's modular architecture supports focused crawling of the Deep Web, with an emphasis on matching, probing, and ranking discovered sources using two key components: service class descriptions and source-biased analysis. We describe the overall architecture of DynaBot and discuss how these components support effective exploitation of the massive amount of data available on the Deep Web.