Context-sensitive vocabulary mapping with a spreading activation network

A spreading activation network model is applied to the problem of reconciling heterogeneous indexing (STI index terms and ApJ subject descriptors) in a database of astronomy documents. Drawing on evidence from a set of co-indexed documents, a three-layer feed-forward network is constructed, consisting of an input term layer (source vocabulary), a document layer, and an output term layer (target vocabulary). Experimental results show that the network can uncover both static term-to-term relationships and relationships that depend on the context of a particular document’s indexing. The static mapping experiment reveals the asymmetric nature of term mapping, and a visualization tool graphically displays the complex term relationships identified by the model. The context-sensitive mapping experiment tests the robustness of the network by removing the document node of each document under test and comparing the performance of the complete network with that of the reduced network. The results imply that mapping depends largely on regularities emerging from the entire pattern of connections in the network rather than on localist representations. Mapping from specific to general performs better than mapping from general to specific. Several issues related to the model are discussed, including its limitations, the application of a learning algorithm, and the generality of the study.
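The three-layer structure described above (source terms, documents, target terms) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the term names and documents are hypothetical, the weighting (a document is activated by the fraction of its source terms that are active, and passes that activation unchanged to its target terms) is a simple stand-in rather than the weighting used in the study, and the `exclude` parameter only mimics the idea of removing a document node to compare the complete and reduced networks.

```python
from collections import defaultdict

# Hypothetical co-indexed documents:
# doc id -> (source-vocabulary terms, target-vocabulary terms).
DOCS = {
    "d1": ({"spiral galaxies", "star formation"},
           {"galaxies: spiral", "stars: formation"}),
    "d2": ({"spiral galaxies"}, {"galaxies: spiral"}),
    "d3": ({"star formation", "molecular clouds"},
           {"stars: formation", "ISM: clouds"}),
}


def spread(source_terms, docs, exclude=None):
    """Spread activation: source terms -> documents -> target terms.

    `exclude` names one document node to leave out of the network
    (an assumed way to model the reduced network).
    Returns (target term, activation) pairs, strongest first.
    """
    active = set(source_terms)

    # Input layer -> document layer: fraction of the document's
    # source terms that are currently active (assumed weighting).
    doc_activation = {}
    for doc_id, (src_terms, _tgt_terms) in docs.items():
        if doc_id == exclude or not src_terms:
            continue
        doc_activation[doc_id] = len(src_terms & active) / len(src_terms)

    # Document layer -> output layer: each active document passes its
    # activation to every target term it is indexed with.
    target_activation = defaultdict(float)
    for doc_id, activation in doc_activation.items():
        if activation == 0:
            continue
        for term in docs[doc_id][1]:
            target_activation[term] += activation

    # Sort by descending activation, then alphabetically for stable ties.
    return sorted(target_activation.items(), key=lambda kv: (-kv[1], kv[0]))


# Static mapping: activate a single source term on the complete network.
static_mapping = spread({"spiral galaxies"}, DOCS)

# Context-sensitive mapping: activate all source terms of d1,
# with d1's own document node removed from the network.
context_mapping = spread(DOCS["d1"][0], DOCS, exclude="d1")
```

In this toy data, the static mapping ranks "galaxies: spiral" highest for "spiral galaxies", while the context-sensitive call has to recover d1's target terms from the remaining documents alone, which is the kind of comparison between complete and reduced networks the abstract refers to.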
