Paper Information - Object Recognition by Scene Alignment

Object Recognition by Scene Alignment

Current object recognition systems can only recognize a limited number of object categories; scaling up to many categories is the next challenge. We seek to build a system to recognize and localize many different object categories in complex scenes. We achieve this through a simple approach: matching the input image, in an appropriate representation, to images in a large training set of labeled images. Due to regularities in object identities across similar scenes, the retrieved matches provide hypotheses for object identities and locations. We build a probabilistic model to transfer the labels from the retrieval set to the input image. We demonstrate the effectiveness of this approach and study the contributions of the algorithm's components using held-out test sets from the LabelMe database.
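The retrieve-then-transfer idea in the abstract can be illustrated with a minimal sketch. This is not the paper's probabilistic model: it swaps in a plain k-nearest-neighbor vote, and the two-dimensional descriptors, the Euclidean distance, and every name here (`retrieve`, `transfer_labels`, `training_set`) are assumptions made for illustration only.

```python
from collections import defaultdict
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query, training_set, k):
    """Return the k labeled scenes whose descriptors are closest to the query."""
    ranked = sorted(training_set, key=lambda item: euclidean(query, item["descriptor"]))
    return ranked[:k]


def transfer_labels(query, training_set, k=2):
    """Score each object label by the fraction of retrieved scenes that contain it.

    A simple vote, standing in (as an assumption) for the paper's
    probabilistic label-transfer model.
    """
    matches = retrieve(query, training_set, k)
    votes = defaultdict(int)
    for item in matches:
        for label in set(item["labels"]):
            votes[label] += 1
    return {label: count / len(matches) for label, count in votes.items()}


# Hypothetical toy data: a scene descriptor plus the object labels annotated in it.
training_set = [
    {"descriptor": [0.9, 0.1], "labels": ["car", "road", "building"]},
    {"descriptor": [0.8, 0.2], "labels": ["car", "road"]},
    {"descriptor": [0.1, 0.9], "labels": ["tree", "sky"]},
]

scores = transfer_labels([0.85, 0.15], training_set, k=2)
```

Labels that appear in every retrieved scene receive the highest score, which mirrors the abstract's point that similar scenes tend to contain the same objects. Transferring object locations would additionally require the annotated regions, which this sketch omits.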

[1] David G. Lowe, et al. Distinctive Image Features from Scale-Invariant Keypoints, 2004, International Journal of Computer Vision.

[2] Daniel P. Huttenlocher, et al. Pictorial Structures for Object Recognition, 2004, International Journal of Computer Vision.

[3] Antonio Torralba, et al. Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope, 2001, International Journal of Computer Vision.

[4] Jitendra Malik, et al. Shape Matching and Object Recognition, 2006, Toward Category-Level Object Recognition.

[5] Radford M. Neal, et al. Density Modeling and Clustering Using Dirichlet Diffusion Trees, 2003.

[6] Antonio Torralba, et al. Contextual Priming for Object Detection, 2003, International Journal of Computer Vision.

[7] Pietro Perona, et al. Object class recognition by unsupervised scale-invariant learning, 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings.

[8] Luc Van Gool, et al. The 2005 PASCAL Visual Object Classes Challenge, 2005, MLCW.

[9] Antonio Torralba, et al. Learning hierarchical models of scenes, objects, and parts, 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[10] Alexei A. Efros, et al. Putting Objects in Perspective, 2006, CVPR.

[11] Christiane Fellbaum, et al. Book Reviews: WordNet: An Electronic Lexical Database, 1999, CL.

[12] Michael I. Jordan, et al. Hierarchical Dirichlet Processes, 2006.

[13] Antonio Torralba, et al. Context-based vision system for place and object recognition, 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[14] Andrea Vedaldi, et al. Objects in Context, 2007, 2007 IEEE 11th International Conference on Computer Vision.

[15] Yee Whye Teh, et al. A Collapsed Variational Bayesian Inference Algorithm for Latent Dirichlet Allocation, 2006, NIPS.

[16] Michael I. Jordan, et al. Nonparametric empirical Bayes for the Dirichlet process mixture model, 2006, Stat. Comput.

[17] R. Fergus, et al. Tiny images, 2007.

[18] H. Ishwaran, et al. Exact and approximate sum representations for the Dirichlet process, 2002.

[19] Alexei A. Efros, et al. Scene completion using millions of photographs, 2007, SIGGRAPH 2007.

[20] C. Fellbaum. An Electronic Lexical Database, 1998.

[21] Antonio Torralba, et al. LabelMe: A Database and Web-Based Tool for Image Annotation, 2008, International Journal of Computer Vision.