Using content analysis to investigate the research paths chosen by scientists over time

We present an application of a clustering technique to a large original dataset of SCI publications. The technique is capable of disentangling the different research lines followed by a scientist, their duration over time, and the intensity of effort devoted to each of them. Information is obtained by means of software-assisted content analysis, based on the co-occurrence of words in the full abstract and title of a set of SCI publications authored by 650 American star physicists across 17 years. We estimate that, over this time span, the scientists in our dataset contributed on average to 16 different research lines, each lasting on average 3.5 years, and published nearly 5 publications in each line of research. The technique is potentially useful for scholars studying science and the research community, for research agencies evaluating whether a scientist is new to a topic, and for librarians collecting timely biographic information.
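The abstract does not specify the clustering algorithm or the software used, so the following is only a minimal sketch of the general idea, under stated assumptions: publications by one scientist are grouped into research lines when their titles and abstracts share enough words, and each line is then summarized by its publication count and duration. The Jaccard overlap measure, the single-link grouping, the 0.3 threshold, the stopword list, the inclusive year count, and the toy records are all illustrative choices, not the authors' method or data.

```python
import re
from collections import defaultdict

# Assumption: a tiny illustrative stopword list, not the one used in the study.
STOPWORDS = {"the", "of", "a", "an", "in", "and", "for", "on", "with", "to", "by"}


def tokens(text):
    """Return the set of lowercase words in a title+abstract string, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def jaccard(a, b):
    """Word overlap between two documents: shared words / all words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def research_lines(pubs, threshold=0.3):
    """Group one scientist's publications into research lines.

    pubs: list of (year, text) tuples.
    Two publications are linked when their word overlap reaches the
    threshold; linked publications form one line (single-link clustering).
    """
    words = [tokens(text) for _, text in pubs]
    parent = list(range(len(pubs)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(pubs)):
        for j in range(i + 1, len(pubs)):
            if jaccard(words[i], words[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i in range(len(pubs)):
        groups[find(i)].append(i)

    lines = []
    for members in groups.values():
        years = [pubs[i][0] for i in members]
        lines.append({
            "pubs": members,                           # indices into pubs
            "n_pubs": len(members),                    # intensity of effort
            "duration": max(years) - min(years) + 1,   # years, counted inclusively
        })
    return sorted(lines, key=lambda line: line["pubs"][0])


# Hypothetical toy records, invented purely for illustration.
toy_pubs = [
    (1990, "Quantum Hall effect in graphene layers"),
    (1992, "Fractional quantum Hall effect in graphene"),
    (1995, "Dark matter halo simulations"),
    (1996, "Simulations of dark matter halo formation"),
]
lines = research_lines(toy_pubs)
for line in lines:
    print(line)
```

On the toy records, the two quantum Hall papers and the two dark matter papers fall into separate lines. A real application would likely add stemming and term weighting and would need a threshold tuned to the corpus.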
