A study of the use of self-organising maps in information retrieval

Purpose – The aim of this paper is to explore the possibility of retrieving information with Kohonen self-organising maps, which are known to be effective at grouping objects according to their similarity or dissimilarity.

Design/methodology/approach – After conventional preprocessing, such as transformation into a vector space, documents from a German document collection were used to train a neural network of the Kohonen self-organising map type. Such an unsupervised network forms a document map from which relevant objects can be found according to queries.

Findings – The self-organising maps ordered documents into groups from which it was possible to find relevant targets.

Research limitations/implications – The number of documents used was moderate due to the limited number of documents associated with the test topics. The training of self-organising maps entails rather long running times, which is their practical limitation. In future, the aim will be to build larger networks by compressing document matrices, and to develop docu...
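The pipeline outlined in the abstract (documents as vectors, an unsupervised map trained on them, queries matched against the map) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`train_som`, `build_document_map`, `retrieve`), the 2x2 grid, the linear decay schedule, and the toy term-count vectors are all assumptions chosen for illustration, and the example uses only the Python standard library.

```python
import math
import random


def best_matching_unit(weights, vector):
    """Return the grid coordinate whose weight vector is closest to `vector`."""
    return min(
        weights,
        key=lambda node: sum((w - x) ** 2 for w, x in zip(weights[node], vector)),
    )


def train_som(vectors, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight_vector}."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        # Assumed schedule: learning rate and neighbourhood shrink linearly.
        decay = 1.0 - epoch / epochs
        lr = lr0 * decay
        radius = max(radius0 * decay, 0.5)
        for vector in rng.sample(vectors, len(vectors)):
            bmu = best_matching_unit(weights, vector)
            for node, w in weights.items():
                # Gaussian neighbourhood measured on the map grid.
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * radius ** 2))
                for i in range(dim):
                    w[i] += lr * h * (vector[i] - w[i])
    return weights


def build_document_map(weights, vectors):
    """Assign each document index to its best-matching map node."""
    doc_map = {}
    for doc_id, vector in enumerate(vectors):
        doc_map.setdefault(best_matching_unit(weights, vector), []).append(doc_id)
    return doc_map


def retrieve(weights, doc_map, query_vector):
    """Return the documents stored at the node that best matches the query."""
    return doc_map.get(best_matching_unit(weights, query_vector), [])


# Toy term-count vectors (hypothetical data, not the German collection).
docs = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 3, 2],
    [0, 0, 2, 3],
]
weights = train_som(docs, rows=2, cols=2)
doc_map = build_document_map(weights, docs)
print(retrieve(weights, doc_map, docs[0]))
```

A query is treated here as just another vector in the same term space, so retrieval reduces to finding its best-matching node and returning the documents placed there. A fuller system would probably also inspect neighbouring nodes and rank the candidates, which this sketch leaves out.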
