Why Information Retrieval Needs Cognitive Science: A call to arms

Much of today’s success in Information Retrieval (IR) comes from a hard approach: employing blazingly fast machines, ever more refined statistics, and increasingly powerful classification schemes. In recent years, however, the hard approach has entered a phase of diminishing returns. This paper explores a softer alternative which, we argue, is still in the phase of increasing returns. As the quality of an IR system is ultimately decided by its users, the approach starts from how these users structure information. Many useful principles for this approach are readily available in the psychological literature. We illustrate the approach with three examples. The first applies the cognitive status of ‘complex nominals’ to improve search results by automatically constructing specialized queries. The second shows how the connection between language and imagery at the ‘basic level’ can be used for multimedia retrieval on the World Wide Web. The final example employs the notion of ‘semantic space’ to make retrieval more effective, especially for large-scale corpora. In each example the results were substantial. The cases we studied illustrate how an approach to information retrieval based on cognitive principles can lead to significant, immediate, and fundamental results. They show how fruitful the application of cognitive science to the core of IR can be, and we believe that both disciplines stand to benefit from this approach.
