Co-recognition of Images and Videos: Unsupervised Matching of Identical Object Patterns and Its Applications

In this chapter, we address the “co-recognition” problem: detecting, matching, and segmenting all identical object-level patterns in images or videos in an unsupervised way. Because no prior knowledge of specific target objects is available in this setting, the solution must rely entirely on geometric and photometric relations among visual features. To solve this problem, we propose a multi-layer match-growing framework that explores the given visual data through intra-layer expansion and inter-layer merging. We demonstrate the effectiveness of this approach on identical object detection, image retrieval, symmetry detection, and action recognition. These applications illustrate the usefulness of co-recognition for several vision problems.
