Paper information - Topic segmentation using word-level semantic relatedness functions

Semantic relatedness deals with the problem of measuring how much two words are related to each other. While there is a large body of research on developing new measures, the use of semantic relatedness (SR) measures in topic segmentation has not been explored. In this research, the performance of different SR measures is evaluated on the topic segmentation problem. To this end, two topic segmentation algorithms that use the difference in SR of words are introduced. Our results indicate that using an SR measure trained on a general-domain corpus achieves better results than topic segmentation algorithms using WordNet or simple word repetition. Furthermore, when compared with computationally more complex algorithms performing global analysis, our local analysis, enhanced with general-domain lexical semantic information, achieves comparable results.
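To make the idea of local analysis driven by word-level SR concrete, the following is a minimal sketch and not either of the two algorithms introduced in the paper, whose details the abstract does not give. It assumes a generic windowed scheme: the mean pairwise relatedness between the words on each side of every sentence gap is computed, and a boundary is placed where that score is a local minimum below a threshold. The names `block_relatedness`, `segment`, `toy_sr`, and `TOY_TOPICS`, as well as the `window` and `threshold` parameters, are hypothetical and chosen for illustration only; `toy_sr` is a hand-made stand-in for a corpus-trained SR measure.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence

# Any word-level SR function: takes two words, returns a score (higher = more related).
Relatedness = Callable[[str, str], float]


def block_relatedness(left: Sequence[str], right: Sequence[str], sr: Relatedness) -> float:
    """Mean pairwise SR between the words of two adjacent blocks."""
    pairs = list(product(left, right))
    if not pairs:
        return 0.0
    return sum(sr(a, b) for a, b in pairs) / len(pairs)


def segment(
    sentences: Sequence[Sequence[str]],
    sr: Relatedness,
    window: int = 2,       # assumed: number of sentences on each side of a gap
    threshold: float = 0.5,  # assumed: maximum score for a gap to count as a boundary
) -> List[int]:
    """Return indices i such that a topic boundary is placed before sentence i."""
    # Score every gap between consecutive sentences using only local context.
    scores: List[float] = []
    for gap in range(1, len(sentences)):
        left = [w for s in sentences[max(0, gap - window):gap] for w in s]
        right = [w for s in sentences[gap:gap + window] for w in s]
        scores.append(block_relatedness(left, right, sr))

    # A boundary is a local minimum of the score curve that falls below the threshold.
    boundaries: List[int] = []
    for i, score in enumerate(scores):
        prev_score = scores[i - 1] if i > 0 else float("inf")
        next_score = scores[i + 1] if i + 1 < len(scores) else float("inf")
        if score < threshold and score <= prev_score and score <= next_score:
            boundaries.append(i + 1)
    return boundaries


# Toy SR measure (an assumption for this example, not a trained measure):
# identical words score 1.0, words sharing a hand-assigned topic score 0.8.
TOY_TOPICS: Dict[str, str] = {
    "cat": "animal", "dog": "animal", "pet": "animal",
    "stock": "finance", "market": "finance", "bank": "finance",
}


def toy_sr(a: str, b: str) -> float:
    if a == b:
        return 1.0
    topic_a, topic_b = TOY_TOPICS.get(a), TOY_TOPICS.get(b)
    return 0.8 if topic_a is not None and topic_a == topic_b else 0.0


sentences = [["cat", "pet"], ["dog", "pet"], ["stock", "market"], ["bank", "market"]]
print(segment(sentences, toy_sr, window=1))  # [2]
```

Because `segment` accepts any callable as `sr`, a word-repetition baseline, a WordNet-based measure, or a corpus-trained measure could be swapped in without changing the segmentation logic, which mirrors the kind of comparison the abstract describes.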

Ilyas Cicekli | Gonenc Ercan
