The nature of indexing: how humans and machines analyze messages and texts for retrieval - Part I: Research, and the nature of human indexing

Abstract: Does human intellectual indexing have a continuing role to play in the face of increasingly sophisticated automatic indexing techniques? In this two-part essay, a computer scientist and long-time TREC participant (Perez-Carballo) and a practitioner and teacher of human cataloging and indexing (Anderson) pursue this question by reviewing the opinions and research of leading experts on both sides of this divide. We conclude that human analysis should be used on a much more selective basis, and we offer suggestions on how these two types of indexing might be allocated to best advantage. Part I of the essay critiques the comparative research, then explores the nature of human analysis of messages or texts and efforts to formulate rules to make human practice more rigorous and predictable. We find that research comparing human and automatic approaches has done little to change strongly held beliefs, in large part because many associated variables have not been isolated or controlled. Part II focuses on current methods in automatic indexing, its gradual adoption by major indexing and abstracting services, and ways of allocating human and machine approaches. Overall, we conclude that both approaches to indexing have been found to be effective by researchers and searchers, each with particular advantages and disadvantages. However, automatic indexing has the overarching advantage of decreasing cost, as human indexing becomes ever more expensive.
