Automatic text decomposition using text segments and text themes

With the widespread use of full-text information retrieval, passage-retrieval techniques are becoming increasingly popular. Large texts can be replaced by important text excerpts, which simplifies the retrieval task and improves retrieval effectiveness. Passage-level evidence about the use of words in local contexts is also useful for resolving language ambiguities and improving retrieval output. This study introduces two main text decomposition strategies: a chronological decomposition into text segments, and a semantic decomposition into text themes. The interaction between text segments and text themes is then used to characterize text structure, and to formulate specifications for information retrieval, text traversal, and text summarization.
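
The abstract does not say how segments or themes are computed, so the following is only an illustrative sketch of the two kinds of decomposition, not the study's method. It assumes paragraphs are compared with a plain bag-of-words cosine similarity and a fixed threshold. The function names (`text_segments`, `text_themes`) and the threshold value are hypothetical. Segments are runs of adjacent paragraphs that stay similar in reading order; themes are groups of paragraphs linked by similarity, whether or not they are adjacent.

```python
import math
import re
from collections import Counter


def term_vector(paragraph):
    """Bag-of-words term counts for one paragraph (lowercased, ASCII letters only)."""
    return Counter(re.findall(r"[a-z]+", paragraph.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * v[term] for term, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def text_segments(paragraphs, threshold=0.2):
    """Chronological decomposition (assumed rule): start a new segment
    whenever two adjacent paragraphs fall below the similarity threshold."""
    if not paragraphs:
        return []
    vectors = [term_vector(p) for p in paragraphs]
    segments = [[0]]
    for i in range(1, len(paragraphs)):
        if cosine(vectors[i - 1], vectors[i]) >= threshold:
            segments[-1].append(i)
        else:
            segments.append([i])
    return segments


def text_themes(paragraphs, threshold=0.2):
    """Semantic decomposition (assumed rule): group paragraphs that are
    connected by above-threshold similarity links, adjacent or not."""
    vectors = [term_vector(p) for p in paragraphs]
    n = len(vectors)
    parent = list(range(n))  # union-find forest over paragraph indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


paragraphs = [
    "Cats purr and cats sleep.",
    "Cats sleep all day.",
    "Stock markets fell sharply.",
    "Cats purr when happy.",
]
print(text_segments(paragraphs))  # [[0, 1], [2], [3]]
print(text_themes(paragraphs))    # [[0, 1, 3], [2]]
```

In this toy input, paragraph 3 returns to the topic of paragraphs 0 and 1 after an unrelated paragraph. It therefore forms its own segment in reading order but joins the same theme, which is the kind of segment and theme interaction the abstract refers to.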
