Statistical analysis of self-organization

The specific form of the neighborhood function and the learning rate in the Kohonen model of the self-organizing map has so far been chosen empirically, since the model is very difficult to analyze. We present a new statistically motivated approach for determining the contribution of each data presentation during training to the final position of the units of the trained map. Experimental results show that applying the commonly used learning rates to a finite training set yields unit locations that are overly influenced by the later presentations (i.e., the last 20% of training samples). We then determine better learning rate schedules and neighborhood functions, which allow the training data to contribute more uniformly to the unit locations. These improved rates are shown to be a suitable generalization of the standard rates given by stochastic approximation theory for a self-organizing map of units.
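
To make the notion of "contribution of each data presentation" concrete, the following is a minimal sketch for the simplest possible case: a single unit with no neighborhood function, updated by w_k = w_(k-1) + a_k (x_k - w_(k-1)). Unrolling this recursion gives a final position that is a weighted sum of the presentations, where presentation k has weight c_k = a_k * prod_{j>k} (1 - a_j). This is an illustrative simplification and not the analysis of the paper itself; the function names, the exponential schedule, and its parameters (a0, af, n) are assumptions chosen only for the example.

```python
import numpy as np

def contributions(rates):
    """Weight c_k of each presentation x_k in the final position of one unit.

    Assumes the update w_k = w_{k-1} + a_k * (x_k - w_{k-1}), so that
    c_k = a_k * prod_{j > k} (1 - a_j).
    """
    rates = np.asarray(rates, dtype=float)
    keep = 1.0 - rates
    # survive[k] = product of (1 - a_j) over all presentations after k
    survive = np.ones_like(rates)
    survive[:-1] = np.cumprod(keep[::-1])[::-1][1:]
    return rates * survive

def last_fraction_share(weights, fraction=0.2):
    """Share of the total contribution carried by the last `fraction` of samples."""
    cut = int(round(len(weights) * (1.0 - fraction)))
    return weights[cut:].sum() / weights.sum()

n = 1000
k = np.arange(1, n + 1)

# Stochastic approximation rate a_k = 1/k: every presentation weighs 1/n.
harmonic = contributions(1.0 / k)

# Hypothetical exponentially decaying rate from a0 down to af (assumed values).
a0, af = 1.0, 0.01
exponential = contributions(a0 * (af / a0) ** (k / n))

print("1/k schedule, share of last 20%:", last_fraction_share(harmonic))
print("exponential schedule, share of last 20%:", last_fraction_share(exponential))
```

Under these assumptions the 1/k schedule spreads the weight evenly over all presentations, while the exponentially decaying schedule concentrates most of the weight in the final presentations, which is the kind of imbalance described above.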