Finding objects by grouping primitives

We describe the use of a representation, called a body plan, to segment and recognize people and animals in complex environments. The representation is an organized collection of grouping hints obtained from a combination of constraints on color and texture and constraints on geometric properties, such as the structure of individual parts and the relationships between parts. The approach is illustrated with two examples of programs that successfully use body plans for recognition: one determines whether a picture contains a scantily clad human, using a body plan built by hand; the other determines whether a picture contains a horse, using a body plan learned from image data. In both cases, the system demonstrates excellent performance on large, uncontrolled test sets and very large and diverse control sets. The mechanism of recognition by assembly is very general; we describe previous work on finding clothing by marking folds and then assembling groups of folds.
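The idea of combining color and texture constraints with geometric constraints between parts can be illustrated with a minimal sketch. This is not the implementation used in either program: the `Segment` fields, the function names, and every threshold below are hypothetical, chosen only to show how primitives that pass a local appearance test might be assembled into pairs that satisfy a geometric relationship.

```python
from dataclasses import dataclass
from itertools import combinations
import math


@dataclass(frozen=True)
class Segment:
    """A candidate part: an elongated image region (hypothetical representation)."""
    x: float              # centre x, in pixels
    y: float              # centre y, in pixels
    length: float         # extent along the major axis, in pixels
    width: float          # extent along the minor axis, in pixels
    skin_fraction: float  # fraction of pixels passing a colour/texture test


def is_candidate(seg, min_skin=0.5, min_aspect=2.0):
    """Appearance and single-part geometry test (thresholds are assumptions)."""
    return seg.skin_fraction >= min_skin and seg.length / seg.width >= min_aspect


def can_pair(a, b, max_gap_factor=1.0, max_length_ratio=2.0):
    """Relationship test: centres close together and lengths comparable."""
    gap = math.hypot(a.x - b.x, a.y - b.y)
    longer = max(a.length, b.length)
    shorter = min(a.length, b.length)
    return gap <= max_gap_factor * longer and longer / shorter <= max_length_ratio


def assemble(segments):
    """Return every pair of candidate segments that satisfies the pairwise test."""
    candidates = [s for s in segments if is_candidate(s)]
    return [(a, b) for a, b in combinations(candidates, 2) if can_pair(a, b)]
```

In a fuller assembly process, pairs accepted at this stage would themselves be tested for grouping into larger assemblies; the sketch stops at pairs to keep the two kinds of constraint visible.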
