On the use of bibliographically related titles for the enhancement of document representations

Abstract: A recent article by Salton and Zhang compared retrieval results for document collections indexed by title and abstract only with results for the same indexing enhanced by terms from bibliographically related titles. They concluded that adding related-title terms is not sufficiently reliable. We feel that this conclusion may be overly pessimistic, because it rests on small negative results from a 146-document “soft science” (information science) sample collection, obtained without the benefit of term selection among the bibliographically related titles. On the other hand, we believe that their positive results with a 3204-document computer science collection are significant and may serve as new evidence that the method can work for scientific literature. We discuss some strategies for term selection and give reasons why cited titles are useful and preferred over other types of related titles.
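The idea of enhancing a document representation with selected terms from related titles can be illustrated with a minimal Python sketch. This is not the procedure used by Salton and Zhang or by this paper: the stopword list, the tokenizer, the function names `tokenize` and `enhance`, and the selection rule (keep a related-title term only if it appears in at least `min_titles` distinct related titles) are all assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = frozenset({"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"})


def tokenize(text):
    """Lowercase the text, split on non-letters, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def enhance(own_text, related_titles, min_titles=2):
    """Return term counts for own_text plus selected related-title terms.

    Hypothetical selection rule: a related-title term is added only if it
    occurs in at least `min_titles` distinct related titles. Its weight is
    increased by the number of related titles that contain it.
    """
    terms = Counter(tokenize(own_text))
    title_freq = Counter()
    for title in related_titles:
        # Count each term once per title, however often it repeats in it.
        title_freq.update(set(tokenize(title)))
    for term, n in title_freq.items():
        if n >= min_titles:
            terms[term] += n
    return terms


if __name__ == "__main__":
    own = "Probabilistic indexing of documents"
    cited = [
        "Citation indexing in science",
        "Automatic document indexing",
        "Bibliographic coupling between scientific papers",
    ]
    print(enhance(own, cited))
```

With `min_titles=1` every related-title term is added, which corresponds to enhancement without term selection; raising the threshold keeps only terms that recur across the related titles.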

[1] Michael E. Lesk, et al. Computer Evaluation of Indexing and Text Processing, 1968, JACM.

[2] John O'Connor. Biomedical citing statements: Computer recognition and use to aid full-text retrieval, 1983, Inf. Process. Manag.

[3] Eugene Garfield, et al. Citation indexing - its theory and application in science, technology, and humanities, 1979.

[4] Kui-Lam Kwok, et al. A probabilistic theory of indexing and similarity measure based on cited and citing documents, 1985, J. Am. Soc. Inf. Sci.

[5] M. M. Kessler. Bibliographic coupling between scientific papers, 1963.

[6] Mary Jane Ruhl, et al. Chemical documents and their titles: Human concept indexing vs. KWIC‐machine indexing, 1964.

[7] Y. Zhang, et al. Enhancement of text representations using related document titles, 1986, Inf. Process. Manag.

[8] Kui-Lam Kwok. The use of title and cited titles as document representation for automatic classification, 1975, Inf. Process. Manag.

[9] Richard A. V. Diener, et al. Informational dynamics of journal article titles, 1984, J. Am. Soc. Inf. Sci.

[10] Samuel Schiminovich. Automatic classification and retrieval of documents by means of a bibliographic pattern discovery algorithm, 1971, Inf. Storage Retr.

[11] G. Olive, et al. Studies to compare retrieval using titles with that using index terms, 1973.

[12] Francis Narin, et al. Clustering of scientific journals, 1973, J. Am. Soc. Inf. Sci.

[13] Edward A. Fox, et al. Characterization of Two New Experimental Collections in Computer and Information Science Containing Textual and Bibliographic Concepts, 1983.

[14] Edward Fox, et al. Extending the boolean and vector space models of information retrieval with p-norm queries and multiple concept types, 1983.

[15] Jacques J. Tocatlian, et al. Are titles of chemical papers becoming more informative, 1970.

[16] Virgil Diodato. The Occurrence of title Words in parts of Research Papers: variations among disciplines, 1982, J. Documentation.

[17] Donald B. Cleveland, et al. Less than full-text indexing using a non-boolean searching model, 1984, J. Am. Soc. Inf. Sci.

[18] B. C. Griffith, et al. The Structure of Scientific Literatures I: Identifying and Graphing Specialties, 1974.

[19] Henry G. Small, et al. The relationship of information science to the social sciences: A co-citation analysis, 1981, Inf. Process. Manag.

[20] Clement T. Yu, et al. On the Construction of Feedback Queries, 1982, JACM.

[21] J. Bavelas. The social psychology of citations, 1978.

[22] A. Neil Yerkey. Models of index searching and retrieval effectiveness of keyword-in-context indexes, 1973, J. Am. Soc. Inf. Sci.

[23] Donald H. Kraft. A comparison of keyword‐in‐context (KWIC) indexing of titles with a subject heading classification system, 1964.

[24] John O'Connor, et al. Citing statements: Computer recognition and use to improve retrieval, 1982, Inf. Process. Manag.

[25] Don R. Swanson, et al. Machinelike indexing by people, 1962.

[26] Michael McGill, et al. Introduction to Modern Information Retrieval, 1983.

[27] Kui-Lam Kwok, et al. Experiments with cited titles for automatic document indexing and similarity measure in a probabilistic context, 1985, SIGIR '85.

[28] A. J. Meadows, et al. The variation in the information content of titles of research papers with time and discipline, 1977.

[29] Alan F. Smeaton, et al. The Retrieval Effects of Query Expansion on a Feedback Document Retrieval System, 1983, Comput. J.

[30] Henry G. Small, et al. Co-citation in the scientific literature: A new measure of the relationship between two documents, 1973, J. Am. Soc. Inf. Sci.

[31] Frances H. Barker, et al. Comparative efficiency of searching titles, abstracts, and index terms in a free‐text data base, 1972.