A modular sequence of retrieval procedures to delineate a scientific field: from vocabulary to citations and back

This communication presents a modular arrangement of lexical and citation operations to achieve a satisfactory delineation of a scientific field. Three querying methods are considered: on journals, on article vocabulary, and on the citation network. Rather than simply combining these querying modes, we consider possible sequences that make the best use of each method. At any stage of the iteration, the retrieved set can be enhanced by either a citation-based analysis or a vocabulary analysis (relative dictionaries), in order to reduce silence (relevant documents that were missed) or noise (irrelevant documents that were retrieved). General noise-reduction techniques, such as clustering, are applicable at various points of the procedure. A particular sequence on a complex field (nanosciences) is described, starting with a journal and lexical query, then applying a citation expansion followed by a final lexical adjustment. Another sequence is sketched, starting with an additional journal-based procedure.
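The sequence described for nanosciences (journal and lexical query, then citation expansion, then lexical adjustment) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the `Doc` record, the toy corpus, the journal names, the `min_share` threshold on the share of references pointing to the seed set, and the use of a plain list of noise terms in place of the relative dictionaries mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    """Hypothetical bibliographic record used only for this sketch."""
    doc_id: str
    journal: str
    terms: frozenset  # keywords or title words of the article
    refs: frozenset   # ids of the documents it cites


def seed_query(docs, core_journals, seed_terms):
    """Step 1: journal query plus lexical query give the seed set."""
    return {d.doc_id for d in docs
            if d.journal in core_journals or d.terms & seed_terms}


def citation_expansion(docs, seed_ids, min_share):
    """Step 2: add documents whose references mostly point to the seed.

    The share-of-references criterion is an assumed, simplified rule.
    """
    expanded = set(seed_ids)
    for d in docs:
        if d.doc_id in seed_ids or not d.refs:
            continue
        share = len(d.refs & seed_ids) / len(d.refs)
        if share >= min_share:
            expanded.add(d.doc_id)
    return expanded


def lexical_adjustment(docs, ids, seed_ids, noise_terms):
    """Step 3: drop added documents that carry a known noise term."""
    by_id = {d.doc_id: d for d in docs}
    return {i for i in ids
            if i in seed_ids or not (by_id[i].terms & noise_terms)}


# Toy corpus with hypothetical journals and terms.
corpus = [
    Doc("a", "Hypothetical Nano Journal", frozenset({"nanotube"}), frozenset()),
    Doc("b", "General Physics", frozenset({"quantum dot"}), frozenset({"a"})),
    Doc("c", "General Chemistry", frozenset({"catalysis"}), frozenset({"a", "b"})),
    Doc("d", "General Optics", frozenset({"nanosecond laser"}), frozenset({"a", "b", "x"})),
    Doc("e", "General Mathematics", frozenset({"topology"}), frozenset({"x"})),
]

seed = seed_query(corpus, {"Hypothetical Nano Journal"}, {"nanotube", "quantum dot"})
expanded = citation_expansion(corpus, seed, min_share=0.5)
final = lexical_adjustment(corpus, expanded, seed, {"nanosecond laser"})
print(sorted(seed), sorted(expanded), sorted(final))
```

In this toy run the citation step reduces silence by recovering document "c", which shares no seed vocabulary, while the lexical step reduces noise by discarding document "d", which cites the seed but carries an off-topic term. Because each step takes a set of ids and returns a set of ids, the steps can be reordered or repeated, which is the modularity the abstract argues for.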
