Technical Report: Semantic Annotation Platforms

Semantic annotation is a key component in realizing the Semantic Web. The volume of existing and new documents on the Web makes manual annotation problematic. Semi-automatic methods have been designed to alleviate this burden, and they are beginning to be implemented in Semantic Annotation Platforms (SAPs). SAPs provide services that support annotation, including ontologies, knowledge bases, information extraction methods, APIs, and user interfaces. This chapter examines the considerations that annotation systems must take into account, such as document structure and the initial ontology, and provides an overview of current SAP implementations. The Semantic Web also enables new and extended applications, such as concept searching, custom web page generation, question-answering systems, and visualization. These applications are all made possible by the semi-automatic annotation services that SAPs provide.
