Fast winner search for SOM-based monitoring and retrieval of high-dimensional data

Self-organizing maps (SOMs) are widely used in engineering and data-analysis tasks, but have so far rarely been applied to very large-scale problems. The reason is the computational cost. Winner search, that is, finding the position of a data sample on the map, is the computational bottleneck, because the data vector must be compared with every model vector of the map. In this paper a method is proposed for reducing the amount of computation by restricting the search to certain low-dimensional subspaces of the original space. The method is suitable for applications in which the map can be computed off-line, for instance data monitoring, classification, and information retrieval. In a case study with the WEBSOM system, which organizes text document collections on a SOM, the amount of computation was reduced to about 14% of the original, and to 6.6% when approximations were utilized.
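
To make the bottleneck concrete, the following minimal Python sketch contrasts exhaustive winner search with a search that first ranks map units in a low-dimensional subspace. It is an illustration under assumptions, not the method of the paper: the abstract does not specify how the subspaces are chosen, so the index set `dims`, the shortlist size `n_candidates`, the two-stage refinement, and all function names are hypothetical, and squared Euclidean distance is assumed as the matching criterion.

```python
def squared_distance(x, m, dims=None):
    """Squared Euclidean distance, optionally restricted to the indices in `dims`."""
    indices = range(len(x)) if dims is None else dims
    return sum((x[i] - m[i]) ** 2 for i in indices)


def full_winner_search(x, models):
    """Baseline: compare x with every model vector in all dimensions."""
    return min(range(len(models)), key=lambda j: squared_distance(x, models[j]))


def restricted_winner_search(x, models, dims, n_candidates):
    """Hypothetical two-stage search: rank units in a subspace, refine a shortlist."""
    # Stage 1: cheap distances that use only the coordinates listed in `dims`.
    ranked = sorted(range(len(models)),
                    key=lambda j: squared_distance(x, models[j], dims))
    shortlist = ranked[:n_candidates]
    # Stage 2: full-dimensional distances, computed only for the shortlist.
    return min(shortlist, key=lambda j: squared_distance(x, models[j]))


# Toy map with three units in four dimensions (illustrative data only).
models = [[0.0, 0.0, 0.0, 0.0],
          [1.0, 1.0, 1.0, 1.0],
          [5.0, 5.0, 5.0, 5.0]]
x = [0.9, 1.2, 1.0, 0.8]

exact = full_winner_search(x, models)
approx = restricted_winner_search(x, models, dims=[0, 1], n_candidates=2)
```

In this sketch the first stage costs `len(dims)` coordinate comparisons per map unit instead of the full dimensionality, and only `n_candidates` units receive a full comparison. When the shortlist is too small or the subspace is poorly chosen, the restricted search can return a different unit than the exhaustive one, which is the kind of trade-off an approximate variant accepts.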