Paper information - Using contextual semantics to automate the Web document search and analysis

Traditional information retrieval techniques require documents that share enough words to build semantic links between them. Such techniques are strongly affected by two factors: synonymy (different words having the same meaning) and polysemy (one word having several meanings), also known as ambiguity. Synonymy may result in a loss of semantic difference, while polysemy may lead to wrong semantic links. S. J. Green (1999) [2] proposed the concept of a synset (a set of words having the same or a close meaning) and used a synset method to address the problems of synonymy and polysemy. Although this solves the synonymy problem, the polysemy problem remains, because it is not actually possible to use an entire document as a basis to identify the meaning of a word. In this paper, we propose the concept of a context-related semantic set, which identifies the meaning of a word by considering the relations between the word and its contexts. We believe that this approach can efficiently solve the ambiguity problem and hence support the automation of Web document searching and analysis.
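The contrast between whole-document context and local context can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not specify; it is a simplified overlap-based (Lesk-style) disambiguation, swapped in for illustration. The sense inventory `SENSES`, the helpers `context_window` and `disambiguate`, and the window size are all hypothetical.

```python
# Toy sense inventory (hypothetical, for illustration only): each sense of an
# ambiguous word maps to a set of words that tend to appear near that sense.
SENSES = {
    "bank": {
        "bank/finance": {"money", "loan", "account", "deposit", "interest"},
        "bank/river": {"river", "water", "shore", "fishing", "boat"},
    },
}


def context_window(tokens, index, size=3):
    """Return the words within `size` positions of tokens[index], excluding it."""
    start = max(0, index - size)
    return set(tokens[start:index]) | set(tokens[index + 1:index + 1 + size])


def disambiguate(tokens, index, senses=SENSES, size=3):
    """Pick the sense whose related-word set overlaps the context window most.

    Returns None when the word is unknown or no sense overlaps the context.
    """
    candidates = senses.get(tokens[index])
    if not candidates:
        return None
    window = context_window(tokens, index, size)
    best_sense, best_overlap = None, 0
    for sense, related in candidates.items():
        overlap = len(window & related)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


doc = "she opened an account at the bank then sat on the river bank fishing".split()

# Local context: each occurrence of "bank" is resolved from its neighbours.
first = disambiguate(doc, doc.index("bank"))   # near "account"
second = disambiguate(doc, len(doc) - 2)       # near "river", "fishing"

# Whole-document context: the window covers every other token in the document,
# so both occurrences receive the same sense.
first_whole_doc = disambiguate(doc, doc.index("bank"), size=len(doc))
```

In this toy document the local window assigns the financial sense to the first occurrence and the river sense to the second, whereas the whole-document window assigns the river sense to both. That is the kind of failure the abstract attributes to using an entire document as the basis for identifying a word's meaning.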

[1] James Allan, et al. Automatic hypertext link typing, 1996.

[2] Stephen J. Green, et al. Building Hypertext Links By Computing Semantic Similarity, 1999, IEEE Trans. Knowl. Data Eng.

[3] Fabio Crestani, et al. On the Use of Information Retrieval Techniques for the Automatic Construction of Hypertext, 1997, Inf. Process. Manag.

[4] James Allan. Building Hypertext Using Information Retrieval, 1997, Inf. Process. Manag.

[5] Graeme Hirst, et al. Lexical Cohesion Computed by Thesaural Relations as an Indicator of the Structure of Text, 1991, CL.

[6] Paul B. Thistlewaite. Automatic Construction and Management of Large Open Webs, 1997, Inf. Process. Manag.