Digital document libraries are an almost perfect application arena for unsupervised neural networks. This is because many of the operations computers have to perform on text documents are classification tasks based on “noisy” input patterns. The “noise” arises from the known inaccuracy of mapping natural language to an indexing vocabulary that represents the contents of the documents. A growing number of papers are dedicated to the use of self-organizing maps to organize the contents of such digital libraries. These papers assume that the data are centrally available, an assumption that is questionable given the massive amount of available information. In this paper we describe an approach for organizing distributed digital libraries based on a system of independent self-organizing maps, each of which represents just a portion of the complete digital library. Furthermore, we argue in favor of integrating these independent maps in a hierarchical fashion, again by means of self-organizing maps. The integration is based on the trained low-level maps.
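The two-level arrangement described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each library portion is given as a list of small term-weight vectors, that every low-level map is a tiny rectangular self-organizing map trained with a standard online update rule, and that the top-level map is trained on the weight vectors of the trained low-level units. That last point is only one possible reading of "based on the trained low-level maps". All names (`train_som`, `best_matching_unit`, `build_hierarchy`, `PARTITIONS`) and the toy data are hypothetical.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position whose weight vector is closest to x."""
    return min(weights,
               key=lambda u: sum((a - b) ** 2 for a, b in zip(weights[u], x)))


def train_som(data, rows, cols, epochs=20, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighborhood
        for x in rng.sample(data, len(data)):    # shuffled pass over the data
            bmu = best_matching_unit(weights, x)
            for unit, w in weights.items():
                d2 = (unit[0] - bmu[0]) ** 2 + (unit[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


def build_hierarchy(partitions, low_shape=(2, 2), top_shape=(2, 2)):
    """Train one SOM per library portion, then a top-level SOM over them."""
    low_maps = [train_som(docs, *low_shape, seed=i)
                for i, docs in enumerate(partitions)]
    # Assumption: the top-level map sees only the trained low-level weight
    # vectors, so no raw documents have to be collected centrally.
    prototypes = [list(w) for m in low_maps for w in m.values()]
    top_map = train_som(prototypes, *top_shape, seed=99)
    return low_maps, top_map


# Hypothetical toy data: two library portions, three index terms each.
PARTITIONS = [
    [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.1, 0.9, 0.0]],
    [[0.0, 0.0, 1.0], [0.0, 0.1, 0.9], [1.0, 0.0, 0.1], [0.8, 0.0, 0.2]],
]

low_maps, top_map = build_hierarchy(PARTITIONS)
print(best_matching_unit(top_map, [1.0, 0.0, 0.0]))
```

In this sketch only the low-level weight vectors travel to the integrating map, which is one way to avoid the central-availability assumption questioned above.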