Word semantics for information retrieval: moving one step closer to the Semantic Web

The goal of the Semantic Web is to create a new form of Web content that is meaningful to computers. It aims to provide greater functionality through intelligent tools such as information extractors, brokers, reasoning services, and question answering systems. Semantics can be addressed at several levels. In this paper, we focus on the lowest level, word semantics, on which higher levels such as the concept, paragraph, or document level can be built. This model, which we call Word Semantics (WS), does not include the rich set of tags proposed by the XML/RDF standards. Nevertheless, the simpler WS format has a major advantage: it can be implemented with existing technologies and resources. In practice, the model relies on understanding word meanings, identifying important named entities such as persons and organizations, and linking all this information through an external general-purpose ontology, namely WordNet. With these features, we regard the WS model as a short but strong step toward the long-term goal of a Semantic Web.
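
As a rough illustration of what a word-level annotation might look like, the sketch below tags each token with a word sense and, where applicable, a named-entity class. It is not the method of the paper: the sense inventory `SENSES`, the gazetteer `ENTITIES`, the sense identifiers such as `bank#n#1`, and the `annotate` function are all hypothetical stand-ins, and the sense choice uses a simple context-overlap heuristic rather than a real disambiguation system or the actual WordNet database.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set

# Hypothetical mini sense inventory standing in for WordNet:
# lemma -> {sense identifier: cue words associated with that sense}.
SENSES: Dict[str, Dict[str, Set[str]]] = {
    "bank": {
        "bank#n#1": {"money", "loan", "deposit", "account"},
        "bank#n#2": {"river", "water", "shore"},
    },
}

# Hypothetical gazetteer standing in for a named-entity recognizer.
ENTITIES: Dict[str, str] = {"ibm": "ORGANIZATION", "miller": "PERSON"}


@dataclass
class Annotation:
    token: str
    sense: Optional[str] = None   # sense identifier, if the lemma is in SENSES
    entity: Optional[str] = None  # entity class, if the token is in ENTITIES


def annotate(text: str) -> List[Annotation]:
    """Attach a sense and/or entity label to each token of `text`."""
    # Naive tokenization: split on whitespace, strip punctuation, lowercase.
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    context = set(tokens)

    result = []
    for tok in tokens:
        ann = Annotation(tok, entity=ENTITIES.get(tok))
        candidates = SENSES.get(tok)
        if candidates:
            # Pick the sense whose cue words overlap most with the sentence.
            ann.sense = max(candidates, key=lambda s: len(candidates[s] & context))
        result.append(ann)
    return result
```

For example, `annotate("Miller opened an account at the bank.")` labels `miller` as a `PERSON` and assigns `bank` the financial sense, because `account` is among that sense's cue words. A realistic system would replace both lookup tables with a word sense disambiguator and a named-entity recognizer linked to WordNet.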
