Supervised Learning from Clustered Input Examples

In this paper we analyse how introducing structure into the input distribution affects the generalization ability of a simple perceptron. We consider the simple case of two clusters of input data and a linearly separable rule. We find that the generalization ability improves with the separation between the clusters and is bounded from below by the result for the unstructured case, which is recovered as the separation between clusters vanishes. The asymptotic behaviour for large training sets, however, is the same for structured and unstructured input distributions. For small training sets, the generalization error is observed to depend non-monotonically on the number of examples for certain values of the model parameters.
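
As a purely illustrative numerical sketch of the setting described above, the following Python code draws inputs from two clusters, labels them with a linearly separable teacher rule, trains a student perceptron, and estimates its generalization error on fresh inputs. Everything specific here is an assumption made for illustration and is not taken from the paper: the clusters are unit-variance Gaussians centred at plus and minus half the separation along a random direction, the student is trained with the classical perceptron learning rule, and the names `sample_inputs`, `train_perceptron` and `run_experiment` are hypothetical. The paper's own analysis is analytical and may use a different learning rule and parametrization.

```python
import numpy as np


def sample_inputs(rng, n_samples, dim, separation, axis):
    """Draw inputs from two unit-variance Gaussian clusters (assumed form).

    The cluster centres sit at +/- (separation / 2) * axis, chosen with
    equal probability. separation = 0 gives the unstructured case.
    """
    signs = rng.choice([-1.0, 1.0], size=n_samples)
    noise = rng.standard_normal((n_samples, dim))
    return noise + 0.5 * separation * signs[:, None] * axis[None, :]


def teacher_labels(inputs, teacher):
    """Linearly separable rule: the sign of the projection onto the teacher."""
    return np.where(inputs @ teacher >= 0.0, 1.0, -1.0)


def train_perceptron(inputs, labels, max_epochs=200):
    """Classical perceptron rule (an assumed choice of training algorithm)."""
    weights = np.zeros(inputs.shape[1])
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(inputs, labels):
            if y * (weights @ x) <= 0.0:  # misclassified or on the boundary
                weights += y * x
                mistakes += 1
        if mistakes == 0:  # all training examples are classified correctly
            break
    return weights


def run_experiment(n_train, dim=20, separation=2.0, seed=0, n_test=20000):
    """Monte Carlo estimate of the generalization error for one training set."""
    rng = np.random.default_rng(seed)
    teacher = rng.standard_normal(dim)
    teacher /= np.linalg.norm(teacher)
    axis = rng.standard_normal(dim)  # direction separating the two clusters
    axis /= np.linalg.norm(axis)

    train_x = sample_inputs(rng, n_train, dim, separation, axis)
    weights = train_perceptron(train_x, teacher_labels(train_x, teacher))

    # Fresh inputs from the same clustered distribution as the training set.
    test_x = sample_inputs(rng, n_test, dim, separation, axis)
    student = np.where(test_x @ weights >= 0.0, 1.0, -1.0)
    return float(np.mean(student != teacher_labels(test_x, teacher)))


if __name__ == "__main__":
    for n_train in (5, 20, 100, 400):
        print(n_train, run_experiment(n_train))
```

Varying `separation` and `n_train` in this sketch gives a rough empirical feel for the dependence discussed above, although a single random training set is noisy; averaging over many seeds would be needed before comparing with any analytical curve.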