A re-weighting strategy for improving margins

We present a simple, general scheme for improving classification margins, inspired by well-known results from margin theory. The scheme is based on a sample re-weighting strategy: replicas of training samples that are not classified with a sufficient margin are added to the training set. As a case study, we present a new algorithm, TVQ, an instance of the proposed scheme that uses a tangent-distance-based 1-NN classifier and implements a form of quantization of the tangent-distance prototypes. The tangent-distance models created in this way show significantly better generalization than standard tangent models, and they outperform other state-of-the-art algorithms, such as SVMs, on an OCR task.
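The core re-weighting idea can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's TVQ implementation: it uses plain Euclidean distance as a stand-in for tangent distance, a 1-NN margin defined as the gap between the nearest other-class prototype and the nearest same-class prototype, and a hypothetical margin threshold `theta`.

```python
import numpy as np

def margin_1nn(proto_X, proto_y, x, y):
    """1-NN margin of sample (x, y) w.r.t. a prototype set:
    distance to the nearest prototype of a different class minus
    distance to the nearest prototype of the same class.
    (Euclidean distance stands in for tangent distance here.)"""
    d = np.linalg.norm(proto_X - x, axis=1)
    return d[proto_y != y].min() - d[proto_y == y].min()

def reweight_by_replication(X, y, proto_X, proto_y, theta=1.0):
    """Re-weighting by replication: append one replica of every
    training sample whose margin falls below the threshold theta."""
    extra_X, extra_y = [], []
    for xi, yi in zip(X, y):
        if margin_1nn(proto_X, proto_y, xi, yi) < theta:
            extra_X.append(xi)
            extra_y.append(yi)
    if extra_X:
        X = np.vstack([X, np.array(extra_X)])
        y = np.concatenate([y, extra_y])
    return X, y
```

Retraining the prototypes on the re-weighted set effectively increases the loss contribution of low-margin samples, which is the mechanism the scheme exploits; in TVQ this loop would be driven by tangent distance rather than the Euclidean stand-in used above.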
