Fractals Text Mining Using Bibliometrics and Database Tomography

Database Tomography (DT) is a textual database analysis system with two major components: (1) algorithms that extract multi-word phrase frequencies and phrase proximities (the physical closeness of multi-word technical phrases) from any type of large textual database, which augment (2) the interpretative capabilities of the expert human analyst. DT was used to obtain technical intelligence from a Fractals database derived from the Science Citation Index/Social Science Citation Index (SCI). Phrase frequency analysis by technical domain experts identified the pervasive technical themes of the Fractals database, and phrase proximity analysis identified the relationships among those themes. Bibliometric analysis of the Fractals literature supplemented the DT results with author, journal, and institution publication and citation data.
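The two algorithmic steps named above, phrase frequency and phrase proximity, can be illustrated with a minimal Python sketch. It is not the DT implementation: the function names (`tokenize`, `phrase_frequencies`, `proximity_counts`), the fixed phrase length `n`, the word `window`, and the sample text are all assumptions made for illustration, and a real system would likely also filter stop words and rely on expert review of the output.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def phrase_frequencies(tokens, n=2):
    """Count every run of n adjacent words as one candidate phrase."""
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def proximity_counts(tokens, theme, window=5, n=2):
    """Count n-word phrases lying within `window` words of each occurrence of `theme`."""
    theme_words = theme.split()
    k = len(theme_words)
    counts = Counter()
    for i in range(len(tokens) - k + 1):
        if tokens[i:i + k] != theme_words:
            continue
        lo = max(0, i - window)                # left edge of the neighborhood
        hi = min(len(tokens), i + k + window)  # right edge (exclusive)
        for j in range(lo, hi - n + 1):
            phrase = " ".join(tokens[j:j + n])
            if phrase != theme:                # skip the theme phrase itself
                counts[phrase] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample text, not taken from the Fractals database.
    sample = ("Fractal dimension of rough surfaces. "
              "Fractal dimension and surface roughness are related.")
    tokens = tokenize(sample)
    print(phrase_frequencies(tokens).most_common(3))
    print(proximity_counts(tokens, "fractal dimension", window=2).most_common(3))
```

In this sketch, high-frequency phrases stand in for candidate themes, and the proximity counts for a chosen theme suggest which other phrases tend to appear near it. Interpreting either list is left to the analyst, in keeping with the second DT component.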
