Paper Information - GLOSSING THE INFORMATION FROM DISTRIBUTED DATABASES

GLOSSING THE INFORMATION FROM DISTRIBUTED DATABASES

The Internet provides a huge amount of useful information, which is arranged into a format for presentation to users. Extracting relevant data from different sources is difficult. In our approach, relevant data are transformed into a structured format that contains only the necessary information. The motivation behind the system is to provide compressed results that are meaningful with respect to concept and category. Different applications store their information in huge databases, and users access the information in these web databases by concept. Results based on the single dimension of concept are not meaningful, and meaningless records end up aligned into web interfaces. In this paper we propose to extract records along two dimensions: concept and category. Using these two dimensions, we organize the records into a structured format and provide meaningful results to users. Compared with using concept alone, using both the concept and category dimensions gives better results.
