Experiments on the automatic construction of hypertexts from texts

Abstract: The problem of (semi-)automatically turning text into hypertext has been identified as important to the growth and development of hypertext as a way of organising information. In this paper we describe an approach we have developed to semi-automatically generate a hypertext from linear texts. We first create nodes and composite nodes composed of ‘mini-hypertexts’. We then compute node-node similarity values using standard information retrieval techniques, and use these values to selectively create node-node links according to the strength of similarity between nodes. What makes our process novel is that link creation also uses a dynamically computed metric which measures the topological compactness of the overall hypertext being generated. Link creation is thus a selective process based not only on node-node similarity but also on the overall layout of the hypertext. Experiments on generating a hypertext from a ...
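The link-creation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names neither the similarity formula nor the compactness metric, so the code assumes TF-IDF cosine similarity for the "standard information retrieval techniques" and a compactness measure in the style of Botafogo, Rivlin and Shneiderman, with unreachable node pairs assigned a distance equal to the number of nodes. The function names, the bidirectional links, the thresholds `min_sim` and `target_compactness`, and the greedy stopping rule are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(texts):
    """Build a sparse TF-IDF vector (dict: term -> weight) for each node text."""
    docs = [Counter(t.lower().split()) for t in texts]
    n = len(docs)
    df = Counter(term for d in docs for term in d)  # document frequency
    return [{t: tf * math.log(n / df[t]) for t, tf in d.items()} for d in docs]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0


def compactness(n, links):
    """Assumed metric: Cp = (Max - total distance) / (Max - Min).

    Unreachable pairs count as distance K = n. Cp is 0.0 for a fully
    disconnected hypertext and 1.0 for a fully connected one.
    """
    if n < 2:
        return 1.0
    K = n
    dist = [
        [0 if i == j else (1 if (i, j) in links else K) for j in range(n)]
        for i in range(n)
    ]
    # Floyd-Warshall all-pairs shortest paths over the directed link graph.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    total = sum(dist[i][j] for i in range(n) for j in range(n) if i != j)
    max_total = (n * n - n) * K
    min_total = n * n - n
    return (max_total - total) / (max_total - min_total)


def build_links(texts, min_sim=0.1, target_compactness=0.6):
    """Greedily add the most similar node pairs as links (hypothetical rule).

    Stops when similarity drops below min_sim or when the hypertext is
    already compact enough, so linking depends on the overall layout too.
    """
    vecs = tfidf_vectors(texts)
    n = len(texts)
    candidates = sorted(
        ((cosine(vecs[i], vecs[j]), i, j) for i in range(n) for j in range(i + 1, n)),
        reverse=True,
    )
    links = set()
    for sim, i, j in candidates:
        if sim < min_sim or compactness(n, links) >= target_compactness:
            break
        links.update({(i, j), (j, i)})  # assumption: links are bidirectional
    return links


if __name__ == "__main__":
    nodes = [
        "hypertext link node",
        "hypertext link similarity",
        "cooking pasta recipe",
    ]
    links = build_links(nodes)
    print(sorted(links), compactness(len(nodes), links))
```

Recomputing compactness before every candidate link keeps the sketch short but is expensive for large node sets; an incremental shortest-path update would be a natural refinement.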
