Robust and Efficient Annotation based on Ontology Evolution for Deep Web Data

Within Deep Web research, data extraction is comparatively mature, whereas data annotation is still at a preliminary stage. Although most researchers accept the use of ontologies for data annotation, existing approaches have notable weaknesses, such as the complexity of the ontology and the limited ability of a static ontology to annotate new pages. In response to these problems, this paper proposes a robust and highly efficient data annotation method based on ontology evolution. Notably, the paper defines a simpler ontology, which can significantly improve annotation efficiency. Experiments indicate that the method can improve both the accuracy and the efficiency of data annotation.
