Shape Quantization and Recognition with Randomized Trees

We explore a new approach to shape recognition based on a virtually infinite family of binary features (queries) of the image data, designed to accommodate prior information about shape invariance and regularity. Each query corresponds to a spatial arrangement of several local topographic codes (or tags), which are in themselves too primitive and common to be informative about shape. All the discriminating power derives from relative angles and distances among the tags. The important attributes of the queries are a natural partial ordering corresponding to increasing structure and complexity; semi-invariance, meaning that most shapes of a given class will answer the same way to two queries that are successive in the ordering; and stability, since the queries are not based on distinguished points and substructures. No classifier based on the full feature set can be evaluated, and it is impossible to determine a priori which arrangements are informative. Our approach is to select informative features and build tree classifiers at the same time by inductive learning. In effect, each tree provides an approximation to the full posterior where the features chosen depend on the branch that is traversed. Due to the number and nature of the queries, standard decision tree construction based on a fixed-length feature vector is not feasible. Instead we entertain only a small random sample of queries at each node, constrain their complexity to increase with tree depth, and grow multiple trees. The terminal nodes are labeled by estimates of the corresponding posterior distribution over shape classes. An image is classified by sending it down every tree and aggregating the resulting distributions. The method is applied to classifying handwritten digits and synthetic linear and nonlinear deformations of three hundred LaTeX symbols. State-of-the-art error rates are achieved on the National Institute of Standards and Technology database of digits. The principal goal of the experiments on symbols is to analyze invariance, generalization error, and related issues, and a comparison with artificial neural network methods is presented in this context.
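
The tree-growing and aggregation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the queries are stood in for by a small fixed pool of binary features rather than tag arrangements, the rule that query complexity increases with depth is omitted, candidate queries are scored by mean conditional entropy, and the leaf distributions are aggregated by simple averaging. All names (`grow_tree`, `classify`, `n_queries`, the toy data) are hypothetical.

```python
import itertools
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def leaf_posterior(labels, n_classes):
    """Empirical class distribution of the training points reaching a leaf."""
    counts = Counter(labels)
    return [counts[c] / len(labels) for c in range(n_classes)]


def grow_tree(data, n_classes, rng, n_queries=3, max_depth=8):
    """Grow one tree from a non-empty list of (binary feature tuple, label).

    Only a small random sample of queries is examined at each node.
    Leaves are lists (posteriors); internal nodes are (query, no, yes) tuples.
    """
    labels = [y for _, y in data]
    if max_depth == 0 or len(set(labels)) == 1:
        return leaf_posterior(labels, n_classes)
    n_features = len(data[0][0])
    best = None
    for q in rng.sample(range(n_features), min(n_queries, n_features)):
        no = [y for x, y in data if not x[q]]
        yes = [y for x, y in data if x[q]]
        if not no or not yes:
            continue  # this query does not split the node
        score = (len(no) * entropy(no) + len(yes) * entropy(yes)) / len(data)
        if best is None or score < best[0]:
            best = (score, q)
    if best is None:  # none of the sampled queries split the data
        return leaf_posterior(labels, n_classes)
    q = best[1]
    no_data = [(x, y) for x, y in data if not x[q]]
    yes_data = [(x, y) for x, y in data if x[q]]
    return (q,
            grow_tree(no_data, n_classes, rng, n_queries, max_depth - 1),
            grow_tree(yes_data, n_classes, rng, n_queries, max_depth - 1))


def leaf_distribution(tree, x):
    """Send x down one tree and return the posterior stored at its leaf."""
    while isinstance(tree, tuple):
        q, no_branch, yes_branch = tree
        tree = yes_branch if x[q] else no_branch
    return tree


def classify(trees, x):
    """Average the leaf posteriors over all trees and take the mode."""
    dists = [leaf_distribution(t, x) for t in trees]
    mean = [sum(col) / len(trees) for col in zip(*dists)]
    return max(range(len(mean)), key=mean.__getitem__), mean


# Toy data (hypothetical): the class is simply the first binary feature.
rng = random.Random(0)
toy = [(x, x[0]) for x in itertools.product((0, 1), repeat=6)]
trees = [grow_tree(toy, 2, rng) for _ in range(25)]
label, posterior = classify(trees, (1, 0, 1, 1, 0, 0))
print(label, posterior)
```

In this sketch each tree sees a different random subset of queries at each node, so individual trees may be weak, while the averaged distribution combines their evidence.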
