Domain and Keyword Specific Data Extraction from Invisible Web Databases

Web information access today relies primarily on search engines. Current search engines cannot index pages that are generated dynamically by back-end databases; this content is known as the invisible web or deep web. The information is hidden behind HTML forms and is available only in response to a user's request. This paper describes a system based on domain- and keyword-specific information extraction.
