Learning the Taxonomy and Models of Categories Present in Arbitrary Images

This paper proposes, and presents a solution to, the problem of simultaneous learning of multiple visual categories present in an arbitrary image set and their inter-category relationships. These relationships, also called their taxonomy, allow categories to be defined recursively, as spatial configurations of (simpler) subcategories each of which may be shared by many categories. Each image is represented by a segmentation tree, whose structure captures recursive embedding of image regions in a multiscale segmentation, and whose nodes contain the associated region properties. The presence of any occurring categories is reflected in the occurrence of associated, similar subtrees within the image trees. Similar subtrees across the entire image set are clustered. Each cluster corresponds to a discovered category, represented by the cluster properties. A (subcategory) cluster of small matching subtrees may occur within multiple clusters (categories) of larger matching subtrees, in different spatial relationships with subtrees from other small clusters. Such recursive embedding, grouping and intersection of clusters is captured in a directed acyclic graph (DAG) which represents the discovered taxonomy. Detection, recognition and segmentation of any of the learned categories present in a new image are simultaneously conducted by matching the segmentation tree of the new image with the learned DAG. This matching also yields a semantic explanation of the recognized category, in terms of the presence of its subcategories. Experiments with a newly compiled dataset of four-legged animals demonstrate good cross-category resolvability.

[1]  Narendra Ahuja,et al.  A Transform for Multiscale Image Segmentation by Integrated Edge and Region Detection , 1996, IEEE Trans. Pattern Anal. Mach. Intell..

[2]  Pietro Perona,et al.  Object class recognition by unsupervised scale-invariant learning , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[3]  Narendra Ahuja,et al.  Extracting Subimages of an Unknown Category from a Set of Images , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[4]  Narendra Ahuja,et al.  Learning Multiscale Image Models of 2D Object Classes , 1998, ACCV.

[5]  Edwin R. Hancock,et al.  Computing approximate tree edit distance using relaxation labeling , 2003, Pattern Recognit. Lett..

[6]  Andrew Zisserman,et al.  A Boundary-Fragment-Model for Object Detection , 2006, ECCV.

[7]  Sanja Fidler,et al.  Hierarchical Statistical Learning of Generic Parts of Object Structure , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[8]  Kaleem Siddiqi,et al.  Matching Hierarchical Structures Using Association Graphs , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[9]  Cordelia Schmid,et al.  A Comparison of Affine Region Detectors , 2005, International Journal of Computer Vision.

[10]  Sven J. Dickinson,et al.  Learning Hierarchical Shape Models from Examples , 2005, EMMCVPR.

[11]  B. Schiele,et al.  Combined Object Categorization and Segmentation With an Implicit Shape Model , 2004 .

[12]  Nebojsa Jojic,et al.  LOCUS: learning object classes with unsupervised segmentation , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[13]  Stuart Geman,et al.  Context and Hierarchy in a Probabilistic Image Model , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[14]  David A. Forsyth,et al.  Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary , 2002, ECCV.

[15]  Alexei A. Efros,et al.  Using Multiple Segmentations to Discover Objects and their Extent in Image Collections , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[16]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[17]  Michael C. Nechyba,et al.  Dynamic trees for unsupervised segmentation and matching of image regions , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[18]  G. G. Stokes "J." , 1890, The New Yale Book of Quotations.