This paper presents an algorithm for text summarization that uses the thematic hierarchy of a text. The algorithm is intended to generate a one-page summary that lets a user skim a large electronic book on a computer display. It first detects the thematic hierarchy of the source text using lexical cohesion measured by term repetition. It then identifies boundary sentences, that is, the sentences at which a topic of appropriate grading (granularity) probably starts. Finally, it generates a structured summary that outlines the thematic hierarchy. The paper mainly describes and evaluates the boundary sentence identification step, and then briefly discusses the readability of one-page summaries.
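The idea of measuring lexical cohesion by term repetition can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the exact cohesion measure, so the cosine similarity of term counts, the fixed sentence window, and the local-minimum rule below are all assumptions, and the function names (`cohesion`, `cohesion_profile`, `boundary_candidates`) are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphabetic terms."""
    return re.findall(r"[a-z]+", sentence.lower())


def cohesion(left, right):
    """Cosine similarity of term counts for two blocks of sentences.

    Assumption: term repetition across the two blocks is taken as the
    cohesion signal; the paper's actual measure may differ.
    """
    a = Counter(t for s in left for t in tokenize(s))
    b = Counter(t for s in right for t in tokenize(s))
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cohesion_profile(sentences, window):
    """Cohesion at each gap i, i.e. between sentence i-1 and sentence i."""
    scores = {}
    for i in range(1, len(sentences)):
        left = sentences[max(0, i - window):i]
        right = sentences[i:i + window]
        scores[i] = cohesion(left, right)
    return scores


def boundary_candidates(scores):
    """Gaps where cohesion is a local minimum: candidate topic boundaries."""
    gaps = sorted(scores)
    candidates = []
    for k, gap in enumerate(gaps):
        prev_score = scores[gaps[k - 1]] if k > 0 else math.inf
        next_score = scores[gaps[k + 1]] if k + 1 < len(gaps) else math.inf
        if scores[gap] < prev_score and scores[gap] <= next_score:
            candidates.append(gap)
    return candidates


if __name__ == "__main__":
    text = [
        "cats purr", "cats sleep", "cats purr",
        "stocks rise", "stocks fall", "stocks rise",
    ]
    profile = cohesion_profile(text, window=2)
    # The sentence that follows each candidate gap is a possible boundary sentence.
    for gap in boundary_candidates(profile):
        print(gap, text[gap])
```

In this sketch the window size plays the role of the topic grading: a wider window smooths the cohesion profile and tends to surface only coarse topic shifts, while a narrower one exposes finer subtopics. Running the profile at several window sizes is one plausible way to approximate a hierarchy of boundaries, though whether the paper proceeds this way is not stated in the abstract.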