Mixture-of-Parts Pictorial Structures for Objects with Variable Part Sets

For many multi-part object classes, the set of parts can vary not only in location but also in type. For example, player formations in American football involve various subsets of player types, and the spatial constraints among players depend largely upon which subset of player types constitutes the formation. In this work, we study the problem of localizing and classifying the parts of such objects. Pictorial structures provide an efficient and robust mechanism for localizing object parts. Unfortunately, these models assume that each object instance involves the same set of parts, making it difficult to apply them directly in our setting. With this motivation, we introduce the mixture-of-parts pictorial structure (MoPPS) model, which is characterized by three components: a set of available parts, a set of constraints that specify legal part subsets, and a function that returns a pictorial structure for any legal part subset. MoPPS inference corresponds to jointly computing the most likely subset of parts and their positions. We propose a restricted, but useful, representation for MoPPS models that facilitates inference via branch-and-bound optimization, which we show is efficient in practice. Experiments in the challenging domain of American football show the effectiveness of the model and inference procedure.
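To make the three components concrete, the following is a minimal sketch of MoPPS-style inference on a toy one-dimensional problem. It is not the branch-and-bound procedure proposed in this work: it simply enumerates every legal subset and every placement, which is feasible only for very small problems. All names (`infer_mopps`, `is_legal`, `structure_for`, `unary_cost`), the quadratic deformation cost, and the toy part labels and numbers are assumptions introduced for illustration.

```python
from itertools import combinations, product


def infer_mopps(parts, is_legal, structure_for, unary_cost, locations):
    """Exhaustively search legal part subsets and placements.

    Returns (cost, subset, positions) with the lowest total cost.
    Brute force only: a stand-in for a real optimizer, not branch-and-bound.
    """
    best = (float("inf"), None, None)
    for k in range(1, len(parts) + 1):
        for subset in combinations(parts, k):
            if not is_legal(subset):
                continue  # the constraints rule this subset out
            # Hypothetical structure: {(a, b): ideal offset of b relative to a}
            edges = structure_for(subset)
            for placement in product(locations, repeat=len(subset)):
                pos = dict(zip(subset, placement))
                # Appearance term: how well each part matches its location
                cost = sum(unary_cost(p, pos[p]) for p in subset)
                # Deformation term: squared deviation from the ideal offset
                cost += sum((pos[b] - pos[a] - off) ** 2
                            for (a, b), off in edges.items())
                if cost < best[0]:
                    best = (cost, subset, pos)
    return best


# Toy setup (all values are made up for illustration).
parts = ("QB", "RB", "WR")                # component 1: available parts
locations = range(5)                      # candidate 1-D positions
observed = {"QB": 1, "RB": 2, "WR": 3}    # where each part "looks" best
ideal_offset = {"RB": 1, "WR": 3}         # ideal offset from the QB


def is_legal(subset):
    # Component 2: a legal subset has the QB plus exactly one other part.
    return "QB" in subset and len(subset) == 2


def structure_for(subset):
    # Component 3: a star-shaped structure rooted at the QB.
    return {("QB", p): ideal_offset[p] for p in subset if p != "QB"}


def unary_cost(part, loc):
    return abs(loc - observed[part])


cost, subset, pos = infer_mopps(parts, is_legal, structure_for,
                                unary_cost, locations)
print(cost, subset, pos)
```

In this toy setup both legal subsets have the same size, so their total costs are directly comparable. With subsets of different sizes, summed costs would favor smaller subsets unless the scores were normalized or a cost were charged for omitted parts, a detail this sketch leaves out.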
