Database tomography for information retrieval

Database tomography is an information extraction and analysis system that operates on textual databases. To date, its primary use has been to identify the pervasive technical thrusts and themes intrinsic to large textual databases, along with the interrelationships among these themes and sub-themes. Its two main algorithmic components are multiword phrase frequency analysis and phrase proximity analysis. This paper shows how database tomography can enhance information retrieval from large textual databases through the newly developed process of simulated nucleation. The principles of simulated nucleation are presented, and its advantages for information retrieval are delineated. An application is described in which a database of journal articles focused on near-Earth space science and technology is developed from the Science Citation Index and Engineering Compendex.
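
The two algorithmic components named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`tokenize`, `phrase_frequencies`, `proximity_counts`), the tokenization rule, and the window size are all assumptions chosen for illustration, and no stop-word filtering is applied. Phrase frequency analysis is approximated here by counting word n-grams, and phrase proximity analysis by counting the words that occur within a fixed window around a chosen theme phrase.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (hypothetical rule)."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def phrase_frequencies(tokens, n=2):
    """Count every contiguous n-word phrase in the token list."""
    return Counter(
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )


def proximity_counts(tokens, theme, window=5):
    """Count words within `window` tokens on either side of each
    occurrence of the theme phrase (the phrase itself is excluded)."""
    theme_tokens = theme.lower().split()
    k = len(theme_tokens)
    counts = Counter()
    for i in range(len(tokens) - k + 1):
        if tokens[i:i + k] == theme_tokens:
            lo = max(0, i - window)
            hi = min(len(tokens), i + k + window)
            counts.update(tokens[lo:i])       # words before the phrase
            counts.update(tokens[i + k:hi])   # words after the phrase
    return counts


# Illustrative input only; a real run would use database records.
sample = "Space science and space technology support space science missions"
sample_tokens = tokenize(sample)
top_phrases = phrase_frequencies(sample_tokens, n=2).most_common(3)
neighbors = proximity_counts(sample_tokens, "space science", window=2)
```

In this sketch, high-frequency phrases would serve as candidate themes, and the words that cluster around a theme phrase would suggest related sub-themes or additional query terms. How those outputs feed into simulated nucleation is described in the paper itself and is not reproduced here.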
