Enhancement of text representations using related document titles

Various attempts have been made over the years to construct enhanced document representations by using thesauruses of related terms, term association maps, or knowledge frameworks that can be used to extract appropriate terms and concepts. None of the proposed methods for the improvement of document representation has proved to be generally useful when applied to a variety of different retrieval environments. Some recent work by Kwok suggests that document indexing may be enhanced by using title words taken from bibliographically related items. An evaluation of the process shows that many useful content words can be extracted from related document titles, as well as many terms of doubtful value. Overall, the procedure is not sufficiently reliable to warrant incorporation into operational automatic retrieval systems.
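
As a rough illustration of the kind of procedure evaluated here, the following minimal sketch adds title words from bibliographically related items to a document's own title terms. It is not Kwok's method or the evaluation procedure itself: the function names, the stopword list, the down-weighting factor `weight`, and the example titles are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption of this sketch).
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def content_words(text):
    """Lowercase the text, split on non-letters, and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def enhance_representation(own_title, related_titles, weight=0.5):
    """Build a term-weight map for one document.

    Each word of the document's own title contributes 1.0 per occurrence;
    each word of a related title contributes `weight` per occurrence.
    The weighting scheme is an illustrative assumption, not taken from the source.
    """
    terms = Counter()
    for word in content_words(own_title):
        terms[word] += 1.0
    for title in related_titles:
        for word in content_words(title):
            terms[word] += weight
    return dict(terms)


# Hypothetical titles, invented for this example.
representation = enhance_representation(
    "Automatic indexing of technical reports",
    [
        "Indexing languages for technical literature",
        "A survey of recent progress",
    ],
)
```

In this toy run the related titles reinforce "indexing" and "technical" and contribute plausible new terms such as "languages" and "literature", but they also bring in words like "survey", "recent", and "progress", which is the mix of useful content words and terms of doubtful value that the evaluation reports.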
