The multilayer perceptron as an approximation to a Bayes optimal discriminant function

The multilayer perceptron, when trained as a classifier using backpropagation, is shown to approximate the Bayes optimal discriminant function. The result is demonstrated for both the two-class problem and multiple classes. It is shown that the outputs of the multilayer perceptron approximate the a posteriori probability functions of the classes being trained. The proof applies to any number of layers and any type of unit activation function, linear or nonlinear.