Evaluating Self-Organizing Map Quality Measures as Convergence Criteria

The self-organizing map (SOM) is a type of artificial neural network with applications in a variety of fields and disciplines. The SOM algorithm uses unsupervised learning to produce a low-dimensional representation of high-dimensional data by “fitting” a grid of nodes to the data over a fixed number of iterations. The low dimensionality of the resulting map allows for a graphical presentation of the data that humans can easily interpret. To ensure that these models are indeed representative of the underlying data, it is essential to evaluate the quality of the maps. Various measures have been developed that quantify a map’s preservation of topology and neighborhoods. Little work, however, has been done to compare these measures with one another. To that end, this research shows that the quality measures used with SOMs can be evaluated as convergence criteria. This is achieved by examining the underlying structure of maps that have converged under different measures. Specifically, the clusters that exist in the maps are compared with the clusters that exist in the input data. For this research, popular real-world and synthetic data sets are used for training. The quality measures studied are quantization error, topographic error, topographic function, neighborhood preservation, and population-based convergence.
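To make two of the listed measures concrete, the following is a minimal sketch using their commonly cited definitions: quantization error as the mean distance from each input vector to its best-matching node, and topographic error as the fraction of inputs whose two closest nodes are not adjacent on the grid. The function names, the dictionary layout for the map, and the choice of an 8-neighborhood on a rectangular grid are assumptions made for illustration and are not taken from this work.

```python
import math

def quantization_error(data, weights):
    """Mean Euclidean distance from each sample to its best-matching node.

    `weights` maps a grid position (row, col) to that node's weight vector
    (a hypothetical layout chosen for this sketch).
    """
    total = 0.0
    for x in data:
        total += min(math.dist(x, w) for w in weights.values())
    return total / len(data)

def topographic_error(data, weights):
    """Fraction of samples whose two closest nodes are not grid neighbors."""
    errors = 0
    for x in data:
        # Rank grid positions by distance of their weight vector to the sample.
        ranked = sorted(weights, key=lambda pos: math.dist(x, weights[pos]))
        (r1, c1), (r2, c2) = ranked[0], ranked[1]
        # Assumption: rectangular grid, nodes are adjacent if they differ
        # by at most one step in each grid direction (8-neighborhood).
        if max(abs(r1 - r2), abs(c1 - c2)) > 1:
            errors += 1
    return errors / len(data)

# A 1x3 map that is "folded": node (0, 2) lies next to node (0, 0) in input
# space even though the two are not adjacent on the grid.
weights = {(0, 0): (0.0, 0.0), (0, 1): (1.0, 0.0), (0, 2): (0.0, 0.1)}
data = [(0.0, 0.0), (1.0, 0.0)]

qe = quantization_error(data, weights)
te = topographic_error(data, weights)
```

In this toy map every sample coincides with a node, so the quantization error is zero, yet the fold is flagged by the topographic error for one of the two samples. Used as a convergence criterion, a measure of this kind would be compared against a threshold after training; the threshold itself is left open here.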
