Indexing and Access for Digital Libraries and the Internet: Human, Database, and Domain Factors

Discussion in the research community and among the general public of content indexing (especially subject indexing) and of access to digital resources, particularly on the Internet, has underutilized research on a variety of factors that matter in the design of such access mechanisms. This article reviews some of these factors and issues and draws implications for information system design in the era of electronic access. Specifically, the following are discussed. Human factors: subject searching vs. indexing, multiple terms of access, folk classification, basic-level terms, and folk access. Database factors: Bradford's Law, vocabulary scalability, and the Resnikoff-Dolby 30:1 Rule. Domain factors: the role of domain in indexing.
