Differential theory of learning for efficient neural network pattern recognition

We describe a new theory of differential learning by which a broad family of pattern classifiers (including many well-known neural network paradigms) can learn stochastic concepts efficiently. We describe the relationship between a classifier's ability to generalize well to unseen test examples and the efficiency of the strategy by which it learns. We present a series of proofs that differential learning is efficient in its information and computational resource requirements, whereas traditional probabilistic learning strategies are not. The proofs are illustrated by a simple example that lends itself to closed-form analysis. We conclude with an optical character recognition task for which three different types of differentially generated classifiers generalize significantly better than their probabilistically generated counterparts.
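
To make the contrast in the abstract concrete, the following minimal Python sketch compares the two kinds of training objectives referred to above. It is an illustrative assumption, not the paper's exact formulation: the "probabilistic" objective fits a classifier's outputs to target probabilities (here, mean squared error against a one-hot target), while the "differential" objective penalizes only the decision margin, i.e. the amount by which the correct output exceeds the largest competing output. The function names, the slope parameter, and the example outputs are hypothetical.

    import numpy as np

    def probabilistic_loss(y: np.ndarray, c: int) -> float:
        """Mean squared error between outputs y and the one-hot target for class c."""
        target = np.zeros_like(y)
        target[c] = 1.0
        return float(np.mean((y - target) ** 2))

    def differential_loss(y: np.ndarray, c: int, slope: float = 10.0) -> float:
        """Smooth penalty on the decision margin y[c] - max_{k != c} y[k].

        A sigmoid of the (negated, scaled) margin gives a differentiable score
        that is near 0 whenever the example is classified correctly, regardless
        of how closely y matches any target probabilities.
        """
        competitors = np.delete(y, c)          # outputs of all competing classes
        margin = y[c] - np.max(competitors)    # positive iff the decision is correct
        return float(1.0 / (1.0 + np.exp(slope * margin)))

    # Both outputs below classify class 0 correctly, so the differential loss is
    # small for both; the probabilistic loss still penalizes the second output
    # heavily for not matching the one-hot target.
    y_confident = np.array([0.90, 0.05, 0.05])
    y_marginal  = np.array([0.40, 0.30, 0.30])
    for y in (y_confident, y_marginal):
        print(probabilistic_loss(y, 0), differential_loss(y, 0))

The sketch is only meant to show why a margin-based (differential) criterion can be satisfied with far looser output values than a probability-fitting criterion; the paper's own objective functions and efficiency proofs are developed in the sections that follow.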