Spanish personal name variations in national and international biomedical databases: implications for information retrieval and bibliometric studies.

OBJECTIVES The study sought to investigate how Spanish names are handled by national and international databases and to identify mistakes that can undermine the usefulness of these databases for locating and retrieving works by Spanish authors.

METHODS We sampled 172 articles published by authors from the University of Granada Medical School between 1987 and 1996 and analyzed the variations in how each of their names was indexed in Science Citation Index (SCI), MEDLINE, and Indice Medico Español (IME). The number and types of variants that appeared for each author's name were recorded and compared across databases to identify inconsistencies in indexing practices. We analyzed the relationship between variability (number of variants of an author's name) and productivity (number of items the name was associated with as an author), the consequences for retrieval of information, and the most frequent indexing structures used for Spanish names.

RESULTS The proportion of authors who appeared under more than one name was 48.1% in SCI, 50.7% in MEDLINE, and 69.0% in IME. Productivity correlated directly with variability: more than 50% of the authors listed on five to ten items appeared under more than one name in any given database, and close to 100% of the authors listed on more than ten items appeared under two or more variants. Productivity correlated inversely with retrievability: as the number of variants for a name increased, the number of items retrieved under each variant decreased. For the most highly productive authors, the number of items retrieved under each variant tended toward one. The most frequent indexing methods varied between databases. In MEDLINE and IME, names were indexed correctly as "first surname second surname, first name initial middle name initial (if present)" in 41.7% and 49.5% of the records, respectively. In SCI, however, the most frequent methods were "first surname, first name initial second name initial" (48.0% of the records) and "first surname and second surname run together, first name initial" (18.3%).

CONCLUSIONS Retrievability on the basis of author's name was poor in all three databases. Each database uses accurate indexing methods, but these methods fail to result in consistency or coherence for specific entries. The likely causes of inconsistency are: (1) use by authors of variants of their names during their publication careers, (2) lack of authority control in all three databases, (3) the use of an inappropriate indexing method for Spanish names in SCI, (4) authors' inconsistent behaviors, and (5) possible editorial interventions by some journals. We offer some suggestions as to how to avert the proliferation of author name variants in the databases.
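The three indexing structures named in the results, and the variability measure defined in the methods, can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the function names `index_forms` and `variability` and the author name "Maria Luisa Garcia Lopez" are invented for illustration, and the punctuation and capitalization follow the wording of the abstract rather than the actual output format of any of the three databases.

```python
from collections import Counter


def index_forms(first_name, middle_name, first_surname, second_surname):
    """Build the three index-entry structures described in the abstract.

    Hypothetical helper: the layout follows the abstract's wording only,
    not the real record format of SCI, MEDLINE, or IME.
    """
    first_initial = first_name[0]
    initials = first_initial + (middle_name[0] if middle_name else "")
    return {
        # "first surname second surname, first name initial middle name initial"
        "both_surnames": f"{first_surname} {second_surname}, {initials}",
        # "first surname, first name initial second name initial"
        "first_surname_only": f"{first_surname}, {initials}",
        # first and second surname run together, first name initial
        "surnames_run_together": f"{first_surname}{second_surname}, {first_initial}",
    }


def variability(indexed_names):
    """Return (number of variants, items per variant) for one author's records."""
    per_variant = Counter(indexed_names)
    return len(per_variant), dict(per_variant)


# Invented example author, not taken from the study sample.
forms = index_forms("Maria", "Luisa", "Garcia", "Lopez")
print(forms["both_surnames"])          # Garcia Lopez, ML
print(forms["first_surname_only"])     # Garcia, ML
print(forms["surnames_run_together"])  # GarciaLopez, M

# Four items by the same author indexed under three different variants.
records = [
    forms["both_surnames"],
    forms["both_surnames"],
    forms["first_surname_only"],
    forms["surnames_run_together"],
]
print(variability(records))
```

In this toy case a search under any single variant retrieves at most two of the author's four items, which is the retrieval loss the results describe for highly productive authors.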
