Conditional entropy minimization in neural network classifiers

We explore the role of entropy manipulation during learning in supervised multilayer perceptron (MLP) classifiers. We show that for a two-layer MLP classifier, conditional entropy minimization in the internal layer is a necessary condition for error minimization in the mapping from input to output. The relationship between entropy and the expected volume and mass of a convex hull constructed from n sample points is examined. We show that minimizing the expected hull volume may have more desirable gradient dynamics than minimizing entropy, and that entropy itself exhibits a geometrical invariance with respect to expected hull volumes. We develop closed-form expressions for the expected convex hull mass and volume in R^1 and relate these to error probability. Finally, we show that learning in an MLP may be accomplished solely by minimizing the conditional expected hull volumes and the expected volume of the "intensity of collision".
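To make the R^1 quantities concrete, the following is a minimal numerical sketch, not taken from the paper: in R^1 the convex hull of n sample points is the interval [min, max], and under the illustrative assumption of i.i.d. Uniform(0,1) samples its expected length has the standard closed form (n-1)/(n+1). The function names and the histogram plug-in entropy estimator below are hypothetical choices for illustration, not the authors' estimators.

```python
import numpy as np

rng = np.random.default_rng(0)

def expected_hull_length_uniform(n):
    # Closed form for i.i.d. Uniform(0,1) samples in R^1:
    # E[max - min] = (n - 1) / (n + 1).
    return (n - 1) / (n + 1)

def empirical_hull_length(x):
    # The convex hull of points in R^1 is the interval [min(x), max(x)],
    # so its "volume" is simply the range of the sample.
    return x.max() - x.min()

def plugin_entropy(x, bins=32):
    # Histogram plug-in estimate of differential entropy (in nats):
    # density in bin i is p_i / w_i, so H ~= -sum_i p_i * log(p_i / w_i).
    counts, edges = np.histogram(x, bins=bins)
    p = counts / counts.sum()
    widths = np.diff(edges)
    nz = p > 0
    return -(p[nz] * np.log(p[nz] / widths[nz])).sum()

n = 200
x = rng.uniform(0.0, 1.0, size=n)
print("closed-form E[hull length]:", expected_hull_length_uniform(n))
print("empirical hull length:     ", empirical_hull_length(x))
print("plug-in entropy (nats):    ", plugin_entropy(x))
```

Applied per class to one-dimensional internal-layer activations, such hull-length and entropy estimates give a rough sense of the quantities whose minimization the abstract relates to error probability; the paper's own closed-form expressions and training procedure are developed in the body of the text.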