Finding objects by grouping primitives

We describe the use of a representation, called a body plan, to segment and recognize people and animals in complex environments. The representation is an organized collection of grouping hints obtained from a combination of constraints on color and texture and constraints on geometric properties, such as the structure of individual parts and the relationships between parts. The approach is illustrated with two examples of programs that successfully use body plans for recognition: one determines whether a picture contains a scantily clad human, using a body plan built by hand; the other determines whether a picture contains a horse, using a body plan learned from image data. In both cases, the system demonstrates excellent performance on large, uncontrolled test sets and very large and diverse control sets. The mechanism of recognition by assembly is very general; we describe previous work on finding clothing by marking folds and then assembling groups of folds.
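The idea of recognition by assembly can be illustrated with a minimal sketch. This is not the authors' implementation. The `Segment` fields, the hue range, and the geometric thresholds are all hypothetical values chosen for illustration. The sketch filters candidate primitives with an appearance test, then searches for a group of parts that satisfy pairwise geometric constraints.

```python
from dataclasses import dataclass
from itertools import combinations
import math


@dataclass(frozen=True)
class Segment:
    """A candidate part: an elongated image region (hypothetical representation)."""
    x: float         # centre x, in pixels
    y: float         # centre y, in pixels
    length: float    # in pixels
    width: float     # in pixels, assumed > 0
    mean_hue: float  # 0..1, a stand-in for a colour/texture descriptor


def passes_appearance(seg, hue_range=(0.02, 0.12)):
    """Colour constraint: keep segments whose hue lies in an assumed range."""
    lo, hi = hue_range
    return lo <= seg.mean_hue <= hi


def compatible(a, b, max_width_ratio=1.5, max_gap_factor=1.0):
    """Geometric constraint on a pair: similar widths, and centres close
    relative to the mean length. Thresholds are illustrative assumptions."""
    ratio = max(a.width, b.width) / min(a.width, b.width)
    gap = math.hypot(a.x - b.x, a.y - b.y)
    return ratio <= max_width_ratio and gap <= max_gap_factor * (a.length + b.length) / 2


def assemble(segments, min_parts=3):
    """Return the first group of `min_parts` pairwise-compatible segments, or None."""
    candidates = [s for s in segments if passes_appearance(s)]
    for group in combinations(candidates, min_parts):
        if all(compatible(a, b) for a, b in combinations(group, 2)):
            return group
    return None


if __name__ == "__main__":
    parts = [
        Segment(0, 0, 40, 10, 0.05),
        Segment(20, 0, 40, 12, 0.06),
        Segment(20, 20, 40, 11, 0.08),
        Segment(10, 10, 40, 10, 0.50),  # rejected by the colour constraint
    ]
    print(assemble(parts))
```

The exhaustive search over combinations is only practical for small candidate sets. The appearance filter is applied first so that the more expensive geometric tests run on fewer primitives.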
