Automatic Creation and Simplified Querying of Semantic Web Content: An Approach Based on Information-Extraction Ontologies

The semantic web represents a major advance in web utility, but it is currently difficult to create semantic-web content because pages must be semantically annotated through processes that are mostly manual and require a high degree of engineering skill. Furthermore, users need an effective way to query the semantic web, but any burden placed on users to learn a query language is unlikely to garner sufficient user support and interest. Unfortunately, both the creation and use of semantic-web pages are difficult, and these are precisely the processes that must be made simple in order for the semantic web to truly succeed. We propose using information-extraction ontologies to handle both of these challenges. In this paper we show how a successful ontology-based data-extraction technique can (1) automatically generate semantic annotations for ordinary web pages, and (2) support free-form, textual queries that will be relatively simple for end users to write.
