An efficient algorithm for learning invariance in adaptive classifiers

In many machine learning applications, one has not only training data but also some high-level information about invariances that the system should exhibit. In character recognition, for example, the answer should be invariant with respect to small spatial distortions of the input images (translations, rotations, scale changes, etc.). The authors have implemented a scheme that minimizes the derivative of the classifier outputs with respect to distortion operators. This not only yields significant gains in learning speed, but also provides a powerful language for specifying what generalizations the network can perform.
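A minimal sketch of the idea described above (not the authors' implementation): in addition to the usual task loss, penalize the directional derivative of the classifier output along a known distortion direction (a "tangent vector"), so that small distortions of the input leave the output approximately unchanged. The network `net`, the parameters, and the tangent vector used here are illustrative assumptions; in practice the tangent would come from a concrete distortion operator such as a small image translation.

```python
import jax
import jax.numpy as jnp

def net(params, x):
    """Toy linear classifier: logits = W x + b."""
    W, b = params
    return W @ x + b

def tangent_penalty(params, x, tangent):
    """Squared norm of the Jacobian-vector product (d net/dx) . tangent,
    i.e. how much the output moves under an infinitesimal distortion."""
    _, jvp_out = jax.jvp(lambda inp: net(params, inp), (x,), (tangent,))
    return jnp.sum(jvp_out ** 2)

def loss(params, x, y, tangent, lam=0.1):
    """Task loss (squared error here) plus the invariance regularizer."""
    err = net(params, x) - y
    return jnp.sum(err ** 2) + lam * tangent_penalty(params, x, tangent)

# Example usage with random data; gradients of the combined loss can be
# used in any standard gradient-based training loop.
x = jax.random.normal(jax.random.PRNGKey(0), (16,))
tangent = jax.random.normal(jax.random.PRNGKey(1), (16,))
y = jnp.zeros(4)
params = (jnp.zeros((4, 16)), jnp.zeros(4))
grads = jax.grad(loss)(params, x, y, tangent)
```

Because the penalty involves only a single Jacobian-vector product per tangent direction, it is far cheaper than explicitly augmenting the training set with many distorted copies of each example, which is the source of the speed advantage claimed in the abstract.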